Multi-Agent Maximization of a Monotone Submodular Function via Maximum Consensus

Navid Rezazadeh
Mechanical and Aerospace Eng. Dept.
University of California Irvine
Irvine, CA
[email protected]
Solmaz S. Kia
Mechanical and Aerospace Eng. Dept.
University of California Irvine
Irvine, CA
[email protected]
Abstract

Constrained submodular set function maximization problems often appear in multi-agent decision-making problems with a discrete feasible set. A prominent example is the problem of multi-agent mobile sensor placement over a discrete domain. However, submodular set function optimization problems are known to be NP-hard. In this paper, we consider a class of submodular optimization problems that consists of maximizing a monotone and submodular set function subject to a partitioned matroid constraint over a group of networked agents that communicate over a connected undirected graph. Our objective is to obtain a distributed suboptimal polynomial-time algorithm that enables each agent to obtain its respective policy via local interactions with its neighboring agents. Our solution is a fully distributed gradient-based algorithm that uses the multilinear extension of the submodular set function and a maximum consensus scheme. This algorithm yields a policy set whose worst-case utility value is within $1-1/\textup{e}-O(1/T)$ of the optimal value, where $T$ is the number of communication rounds. A numerical example demonstrates our results.

1 Introduction

In recent years, optimal multi-agent sensor placement/dispatch to cover areas for sampling/observation objectives has been of great interest in the robotics community for solving various coverage, exploration, and monitoring problems. Depending on the problem setting, the planning space can be continuous [1, 2, 3, 4] or discrete [5, 6, 7, 8]. In this paper, our interest is in planning in discrete space. Many optimal discrete policy-making problems for multi-agent sensor placement appear in the form of maximizing a utility for the team [9, 10, 11, 12, 13, 5]. The particular subset of these problems that we are interested in takes the form of the set function maximization problem described by

$\underset{\mathcal{R}\in\mathcal{I}\subset 2^{\mathcal{P}}}{\textup{max}}\;f(\mathcal{R}),$   (1)

where $\mathcal{I}\subset 2^{\mathcal{P}}$ is the admissible set of subsets of the ground policy set $\mathcal{P}$, which is the union of the local policy sets of the agents; see Fig. 1 for an example.

Problem (1) is known to be NP-hard [14]. However, when the objective function is a monotone increasing and submodular set function, it is well known that the sequential greedy algorithm [15] delivers a polynomial-time suboptimal solution with a guaranteed optimality bound. For large-scale submodular maximization problems, reducing the size of the problem through approximations [16] or using several processing units leads to a faster sequential greedy algorithm, at the expense of a weaker optimality bound [17, 18, 19, 20].

When the set from which the agents can choose their policy is the same for all agents and the agents are homogeneous, the feasible set $\mathcal{I}$ of (1) can be described by a uniform matroid constraint (see Section 3 for a formal definition). For problems with a uniform matroid constraint, the sequential greedy algorithm delivers a suboptimal solution whose utility value is no worse than $1-\frac{1}{\textup{e}}$ times the optimal utility value [14]. On the other hand, when the local policy sets of the agents are disjoint and/or the agents are heterogeneous, and each agent has to choose one policy from its local set, the feasible set $\mathcal{I}$ of (1) can be described by the so-called partitioned matroid constraint (see Section 3 for a formal definition). For problems with a partitioned matroid constraint, the sequential greedy algorithm delivers a suboptimal solution whose utility value is no worse than $\frac{1}{2}$ times the optimal utility value [15]. More recently, for problems with a partitioned matroid constraint, suboptimal solutions with a better optimality gap have been proposed in [21, 22, 23, 24] using the multilinear continuous relaxation of the submodular set function. The multilinear continuous relaxation extends a submodular set function defined on the vertices of the $n$-dimensional hypercube to the full hypercube and makes better optimality bounds for the maximization problem (1) attainable by allowing the use of continuous gradient-based optimization algorithms. That is, the relaxation transforms the original discrete problem into a continuous optimization problem with linear constraints. The optimality bound achieved by the algorithm proposed in [21] is $1-\frac{1}{\textup{e}}$ for a partitioned matroid constraint, which outperforms the sequential greedy algorithm, whose optimality bound is $1/2$. However, this solution requires a central authority to find the solution. In sensor placement problems where the agents are self-organizing autonomous mobile agents with communication and computation capabilities, it is desirable to solve the optimization problem (1) in a distributed way, without involving a central authority. A distributed multilinear-extension-based continuous algorithm that uses an average consensus scheme between the agents to solve (1) subject to a partitioned matroid constraint is proposed in [25]. However, that algorithm assumes access to the exact multilinear extension of the utility function, whose computation is exponentially hard.

In this paper, motivated by the improved optimality gap of multilinear-continuous-relaxation-based algorithms, we develop a distributed implementation of the algorithm of [21] over a connected undirected graph to obtain a suboptimal solution of (1) subject to a partitioned matroid constraint. To manage the computational cost of constructing the multilinear extension of the utility function, we use a sample-based evaluation of the multilinear extension and propose a gradient-based algorithm that uses a maximum consensus message-passing scheme over the communication graph. Our algorithm uses a fully distributed rounding technique to compute the final solution of each agent. Through rigorous analysis, we show that our proposed distributed algorithm achieves, with a known probability, a $1-\frac{1}{\textup{e}}-O(1/T)$ optimality bound, where $T$ is the number of times the agents communicate over the network. A numerical example demonstrates our results.

The organization of this paper is as follows. Section 2 introduces the notation used throughout the paper. Section 3 provides the necessary background on submodularity and submodular functions. Section 4 states the problem. We introduce our proposed algorithm in Section 5. A demonstration of our results is provided in Section 6. Appendix A gives the proofs of our main results in Section 5. Appendix B gives the auxiliary results that we use in establishing the proofs of our main results.

2 Notation

We denote vectors with bold small font. The $p^{\textup{th}}$ element of vector $\boldsymbol{\mathbf{x}}$ is denoted by $x_{p}$. We denote the inner product of two vectors $\boldsymbol{\mathbf{x}}$ and $\boldsymbol{\mathbf{y}}$ of appropriate dimensions by $\boldsymbol{\mathbf{x}}.\boldsymbol{\mathbf{y}}$. Furthermore, we use $\boldsymbol{\mathbf{1}}$ for the vector of ones of appropriate dimension. We denote sets with capital calligraphic font. For $\mathcal{P}=\{1,\cdots,n\}$ and a vector $\boldsymbol{\mathbf{x}}\in{\mathbb{R}}_{\geq 0}^{n}$ with $0\leq x_{p}\leq 1$, the set $\mathcal{R}_{\boldsymbol{\mathbf{x}}}$ is a random set in which each $p\in\mathcal{P}$ appears with probability $x_{p}$; hence, we call such a vector $\boldsymbol{\mathbf{x}}$ a membership probability vector. Furthermore, for $\mathcal{S}\subset\mathcal{P}$, $\boldsymbol{\mathbf{1}}_{\mathcal{S}}\in\{0,1\}^{n}$ is the vector whose $p^{\textup{th}}$ element is $1$ if $p\in\mathcal{S}$ and $0$ otherwise; we call $\boldsymbol{\mathbf{1}}_{\mathcal{S}}$ the membership indicator vector of set $\mathcal{S}$. $|x|$ is the absolute value of $x\in{\mathbb{R}}$. Overloading the notation, we also use $|\mathcal{S}|$ for the cardinality of a set $\mathcal{S}$. We denote a graph by $\mathcal{G}(\mathcal{A},\mathcal{E})$, where $\mathcal{A}$ is the node set and $\mathcal{E}\subset\mathcal{A}\times\mathcal{A}$ is the edge set. $\mathcal{G}$ is undirected if and only if $(i,j)\in\mathcal{E}$ implies that agents $i$ and $j$ can exchange information. An undirected graph is connected if there is a path from each node to every other node in the graph. We denote the set of neighboring nodes of node $i$ by $\mathcal{N}_{i}=\{j\in\mathcal{A}\,|\,(i,j)\in\mathcal{E}\}$, and we use $d(\mathcal{G})$ to denote the diameter of the graph.

Given a set $\mathcal{F}\subset\mathcal{X}\times{\mathbb{R}}$ and an element $(p,\alpha)\in\mathcal{X}\times{\mathbb{R}}$, we define the addition operator $\oplus$ as $\mathcal{F}\oplus\{(p,\alpha)\}=\{(u,\gamma)\in\mathcal{X}\times{\mathbb{R}}\,|\,(u,\gamma)\in\mathcal{F},u\not=p\}\cup\{(u,\gamma+\alpha)\in\mathcal{X}\times{\mathbb{R}}\,|\,(u,\gamma)\in\mathcal{F},u=p\}\cup\{(p,\alpha)\in\mathcal{X}\times{\mathbb{R}}\,|\,(p,\gamma)\not\in\mathcal{F}\}$. Given a collection of sets $\mathcal{F}_{j}\subset\mathcal{X}\times{\mathbb{R}}$, $j\in\mathcal{N}$, we define the max-operation over this collection as $\underset{j\in\mathcal{N}}{\textup{MAX}}\,\mathcal{F}_{j}=\{(u,\gamma)\in\mathcal{X}\times{\mathbb{R}}\,|\,(u,\gamma)\in\bar{\mathcal{F}}\text{ s.t. }\gamma=\underset{(u,\alpha)\in\bar{\mathcal{F}}}{\textup{max}}\,\alpha\}$, where $\bar{\mathcal{F}}=\bigcup_{j\in\mathcal{N}}\mathcal{F}_{j}$.
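To make these two operators concrete, the following is a minimal Python sketch (an illustration, not part of the paper), assuming each set $\mathcal{F}\subset\mathcal{X}\times{\mathbb{R}}$ is stored as a dictionary mapping a policy to its value; the policy names are hypothetical.

def oplus(F, p, alpha):
    # F (+) {(p, alpha)}: keep entries with u != p, add alpha to the entry of p,
    # or insert (p, alpha) if p is not yet present in F.
    G = dict(F)
    G[p] = G.get(p, 0.0) + alpha
    return G

def max_merge(F_collection):
    # MAX over a collection of sets: for every policy, keep the largest value
    # it attains across the collection.
    merged = {}
    for F in F_collection:
        for u, gamma in F.items():
            merged[u] = max(merged.get(u, gamma), gamma)
    return merged

# Example with hypothetical policies p1 and p2.
F1 = {"p1": 0.20, "p2": 0.10}
F2 = {"p2": 0.30}
print(oplus(F1, "p1", 0.05))   # {'p1': 0.25, 'p2': 0.1}
print(max_merge([F1, F2]))     # {'p1': 0.2, 'p2': 0.3}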

3 Submodular Functions

A set function $f:2^{\mathcal{P}}\to{\mathbb{R}}_{\geq 0}$ is monotone increasing if for sets $\mathcal{P}_{1}\subset\mathcal{P}_{2}\subset\mathcal{P}$,

$f(\mathcal{P}_{1})\leq f(\mathcal{P}_{2})$

holds. Furthermore, defining $\Delta_{f}(p|\mathcal{S})=f(\mathcal{S}\cup\{p\})-f(\mathcal{S})$, $\mathcal{S}\subset\mathcal{P}$, the set function $f$ is submodular if for $\mathcal{P}_{1}\subset\mathcal{P}_{2}\subset\mathcal{P}$ and $p\in\mathcal{P}\setminus\mathcal{P}_{2}$,

$\Delta_{f}(p|\mathcal{P}_{1})\geq\Delta_{f}(p|\mathcal{P}_{2})$   (2)

holds; this inequality captures the diminishing-returns property of submodular set functions. Furthermore, a set function is normalized if $f(\emptyset)=0$.

In what follows, without loss of generality, we assume that the ground set is $\mathcal{P}=\{1,\cdots,n\}$. Submodular functions assign values to all subsets of the finite set $\mathcal{P}$. Using the set membership indicator, we can equivalently regard them as functions on the Boolean hypercube, $f:\{0,1\}^{n}\to{\mathbb{R}}$. For a submodular function $f:2^{\mathcal{P}}\to{\mathbb{R}}_{\geq 0}$, its multilinear extension $F:[0,1]^{n}\to{\mathbb{R}}_{\geq 0}$ in the continuous space is

$F(\boldsymbol{\mathbf{x}})=\sum_{\mathcal{R}\subset\mathcal{P}}f(\mathcal{R})\prod_{p\in\mathcal{R}}x_{p}\prod_{p\not\in\mathcal{R}}(1-x_{p}),\qquad\boldsymbol{\mathbf{x}}\in[0,1]^{n}.$   (3)

The multilinear extension $F$ of the set function $f$ is the unique multilinear function agreeing with $f$ on the vertices of the hypercube. The multilinear extension has the following properties.

Lemma 3.1 (See [21])

If $f$ is non-decreasing, then $\frac{\partial F}{\partial x_{p}}\geq 0$ for all $p\in\mathcal{P}$, everywhere in $[0,1]^{n}$ (monotonicity of $F$). If $f$ is submodular, then $\frac{\partial^{2}F}{\partial x_{p}\partial x_{q}}\leq 0$ for all $p,q\in\mathcal{P}$, everywhere in $[0,1]^{n}$.

An alternative way to view $F$ is to observe that it is equivalent to

$F(\boldsymbol{\mathbf{x}})=\mathbb{E}[f(\mathcal{R}_{\boldsymbol{\mathbf{x}}})],$   (4)

where $\mathcal{R}_{\boldsymbol{\mathbf{x}}}\subset\mathcal{P}$ is a random set in which each element $p\in\mathcal{P}$ appears independently with probability $x_{p}$. Then, it can be shown that taking the derivatives of $F(\boldsymbol{\mathbf{x}})$ yields

$\frac{\partial F}{\partial x_{p}}(\boldsymbol{\mathbf{x}})=\mathbb{E}[f(\mathcal{R}_{\boldsymbol{\mathbf{x}}}\cup\{p\})-f(\mathcal{R}_{\boldsymbol{\mathbf{x}}}\setminus\{p\})],$   (5)

and

$\frac{\partial^{2}F}{\partial x_{p}\partial x_{q}}(\boldsymbol{\mathbf{x}})=\mathbb{E}[f(\mathcal{R}_{\boldsymbol{\mathbf{x}}}\cup\{p,q\})-f(\mathcal{R}_{\boldsymbol{\mathbf{x}}}\cup\{q\}\setminus\{p\})-f(\mathcal{R}_{\boldsymbol{\mathbf{x}}}\cup\{p\}\setminus\{q\})+f(\mathcal{R}_{\boldsymbol{\mathbf{x}}}\setminus\{p,q\})].$   (6)

Constructing $F(\boldsymbol{\mathbf{x}})$ in (3) requires knowledge of $f(\mathcal{R})$ for all $\mathcal{R}\subset\mathcal{P}$, which becomes computationally intractable as the size of the ground set $\mathcal{P}$ increases. However, with the stochastic interpretation (4) of the multilinear extension and its derivatives, if enough samples are drawn according to $\boldsymbol{\mathbf{x}}$, we can obtain an estimate of $F(\boldsymbol{\mathbf{x}})$ at a reasonable computational cost; a minimal sketch of such a sampling-based estimator is given below. We can use the Chernoff-Hoeffding inequality to quantify the quality of this estimate given the number of samples.
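As an illustration of this sampling-based estimation, the following is a minimal Monte Carlo sketch (the dictionary representation of $\boldsymbol{\mathbf{x}}$, the policy names, and the additive toy utility are illustrative assumptions, not part of the paper):

import random

def sample_set(x):
    # Draw a random set R_x: include each policy p independently with probability x[p].
    return {p for p, prob in x.items() if random.random() < prob}

def estimate_F(f, x, K=1000):
    # Monte Carlo estimate of the multilinear extension F(x) = E[f(R_x)], cf. (4).
    return sum(f(sample_set(x)) for _ in range(K)) / K

def estimate_dF(f, x, p, K=1000):
    # Monte Carlo estimate of dF/dx_p = E[f(R_x U {p}) - f(R_x \ {p})], cf. (5).
    total = 0.0
    for _ in range(K):
        R = sample_set(x)
        total += f(R | {p}) - f(R - {p})
    return total / K

# Additive toy utility (monotone and submodular): f(R) = sum of the weights in R.
weights = {"a": 3.0, "b": 2.0, "c": 1.0}
f = lambda R: sum(weights[p] for p in R)
x = {"a": 0.5, "b": 0.25, "c": 1.0}
print(estimate_F(f, x, K=5000))        # close to 0.5*3 + 0.25*2 + 1.0*1 = 3.0
print(estimate_dF(f, x, "a", K=5000))  # close to dF/dx_a = 3.0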

Theorem 3.1 (Chernoff-Hoeffding’s inequality [26])

Consider a set of $K$ independent random variables $X_{1},\dots,X_{K}$ with $a<X_{i}<b$. Letting $S_{K}=\sum_{i=1}^{K}X_{i}$, then

$\mathbb{P}[|S_{K}-\mathbb{E}[S_{K}]|>\delta]<2\,\textup{e}^{-\frac{2\delta^{2}}{K(b-a)^{2}}}.$
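To illustrate how this inequality is used to quantify the sampling error (a sketch of the standard argument, not a statement quoted from the paper), let $X_{k}=f(\mathcal{R}^{(k)}\cup\{p\})-f(\mathcal{R}^{(k)}\setminus\{p\})$, $k=1,\dots,K$, denote the per-sample marginal gains used to estimate $\frac{\partial F}{\partial x_{p}}$. Monotonicity and submodularity give $0\leq X_{k}\leq f(\{p\})$, so applying Theorem 3.1 to $S_{K}=\sum_{k=1}^{K}X_{k}$ with $\delta=K\epsilon$ yields

$\mathbb{P}\left[\left|\frac{1}{K}S_{K}-\frac{\partial F}{\partial x_{p}}\right|>\epsilon\right]<2\,\textup{e}^{-\frac{2K\epsilon^{2}}{f(\{p\})^{2}}},$

so that an accuracy level of, e.g., $\epsilon=f(\{p\})/(4T)$ fails with probability less than $2\,\textup{e}^{-\frac{K}{8T^{2}}}$, which is the form of the per-estimate failure probability that appears later in Theorem 5.1.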

Problems such as sensor placement with a limited number of sensors and many candidate placement locations can be formulated as a set value optimization (1) in which the feasible set $\mathcal{I}$ is a matroid. A matroid is defined as the pair $\mathcal{M}=\{\mathcal{P},\mathcal{I}\}$ with $\mathcal{I}\subset 2^{\mathcal{P}}$ such that

  • if $\mathcal{B}\in\mathcal{I}$ and $\mathcal{A}\subset\mathcal{B}$, then $\mathcal{A}\in\mathcal{I}$;

  • if $\mathcal{A},\mathcal{B}\in\mathcal{I}$ and $|\mathcal{B}|>|\mathcal{A}|$, then there exists $x\in\mathcal{B}\setminus\mathcal{A}$ such that $\mathcal{A}\cup\{x\}\in\mathcal{I}$.

A well-known matroid is the uniform matroid $\mathcal{M}=\{\mathcal{P},\mathcal{I}\}$ with $\mathcal{I}=\{\mathcal{R}\subset\mathcal{P}\,:\,|\mathcal{R}|\leq k\}$. A sensor placement scenario in which $k$ sensors are to be placed at locations selected out of the possible locations $\mathcal{P}$ can be described by a uniform matroid. On the other hand, when there is a set of agents $\mathcal{A}$, each with $k_{i}$ sensors, $i\in\mathcal{A}$, that it can place in a specific and non-overlapping part $\mathcal{P}_{i}$ of a discrete space, the constraint can be described by the partitioned matroid $\mathcal{M}=\{\mathcal{P},\mathcal{I}\}$, where $\mathcal{I}=\{\mathcal{R}\subset\mathcal{P}\,:\,|\mathcal{R}\cap\mathcal{P}_{i}|\leq k_{i}\}$ with $\bigcup_{i=1}^{N}\mathcal{P}_{i}=\mathcal{P}$ and $\mathcal{P}_{i}\cap\mathcal{P}_{j}=\emptyset$ for $i\neq j$; a minimal sketch of both feasibility checks is given below.
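As a concrete illustration, the following minimal sketch of the two feasibility checks uses hypothetical policy names and blocks:

def feasible_uniform(R, k):
    # Uniform matroid: at most k policies may be selected overall.
    return len(R) <= k

def feasible_partitioned(R, partition, k):
    # Partitioned matroid: at most k[i] policies from each agent's block P_i.
    return all(len(R & P_i) <= k[i] for i, P_i in partition.items())

# Two agents, one policy allowed per agent (k_i = 1).
partition = {1: {"p1", "p2"}, 2: {"p3", "p4"}}
k = {1: 1, 2: 1}
print(feasible_partitioned({"p1", "p3"}, partition, k))  # True
print(feasible_partitioned({"p1", "p2"}, partition, k))  # False: two picks from P_1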

The matroid polytope is the convex hull defined as

$P(\mathcal{M})=\{\boldsymbol{\mathbf{x}}\in{\mathbb{R}}_{\geq 0}^{|\mathcal{P}|}\,:\,\forall\mathcal{Q}\subset\mathcal{P},\;\sum_{p\in\mathcal{Q}}x_{p}\leq r_{\mathcal{M}}(\mathcal{Q})\},$

where $r_{\mathcal{M}}$ is the rank function of the matroid, defined as

$r_{\mathcal{M}}(\mathcal{Q})=\textup{max}\{|\mathcal{S}|\,:\,\mathcal{S}\subset\mathcal{Q},\,\mathcal{S}\in\mathcal{I}\}.$

For the partitioned matroid with $k_{i}=1$, which is the focus of this paper, the polytope is such that for $\boldsymbol{\mathbf{x}}=[\boldsymbol{\mathbf{x}}_{1}^{\top},\dots,\boldsymbol{\mathbf{x}}_{N}^{\top}]^{\top}$, with $\boldsymbol{\mathbf{x}}_{i}\in{\mathbb{R}}_{\geq 0}^{|\mathcal{P}_{i}|}$ the membership probability vector associated with the policies in $\mathcal{P}_{i}$, we have $\boldsymbol{\mathbf{1}}^{\top}\boldsymbol{\mathbf{x}}_{i}\leq 1$, i.e.,

$P(\mathcal{M})=\{\boldsymbol{\mathbf{x}}\in{\mathbb{R}}_{\geq 0}^{|\mathcal{P}|}\,\big|\,\sum\nolimits_{p=1}^{|\mathcal{P}_{i}|}x_{ip}\leq 1,\;\forall i\in\mathcal{A}\}.$   (7)
Figure 1: Let the policy set of each mobile sensor $i\in\mathcal{A}$ be $\mathcal{P}_{i}=\{(i,p)\,|\,p\in\mathcal{B}_{i}\}$, where $\mathcal{B}_{i}\subset\mathcal{B}$ is the set of allowable sensor placement points for agent $i\in\mathcal{A}$ out of all sensor placement points $\mathcal{B}$. Note that by definition, for any two agents $i,j\in\mathcal{A}$, $\mathcal{P}_{i}\cap\mathcal{P}_{j}=\emptyset$. The sensors are heterogeneous in the sense that the sizes of their sensing zones differ. The objective is to place the sensors at points in $\mathcal{B}$ such that the total number of observed points of interest is maximized. The utility function, the number of observed points, is known to be a monotone increasing submodular function of the agents' sensing zones [27]. This sensor placement problem can be formalized as the optimization problem (8). The agents communicate over a connected undirected graph, and their objective is to obtain their respective placement points by interacting only with their communicating neighbors.

4 Problem Statement

We consider a group of $N$ agents $\mathcal{A}=\{1,\dots,N\}$ with communication and computation capabilities interacting over a connected undirected graph $\mathcal{G}(\mathcal{A},\mathcal{E})$. These agents aim to solve, in a distributed manner, the optimization problem

$\underset{\mathcal{R}\in\mathcal{I}}{\textup{max}}\;f(\mathcal{R})\quad\textup{s.t.}$   (8a)
$\mathcal{I}=\big\{\mathcal{R}\subset\mathcal{P}\,\big|\,|\mathcal{R}\cap\mathcal{P}_{i}|\leq 1,~i\in\mathcal{A}\big\},$   (8b)

where the utility function $f:2^{\mathcal{P}}\to{\mathbb{R}}_{\geq 0}$ is a monotone increasing and submodular set function over the discrete policy space $\mathcal{P}=\bigcup_{i=1}^{N}\mathcal{P}_{i}$, with $\mathcal{P}_{i}$ the policy space of agent $i\in\mathcal{A}$, which is known only to agent $i$. In this paper, we work in the value oracle model, where the only access to the utility function is through a black box returning $f(\mathcal{R})$ for a given set $\mathcal{R}$; every agent can obtain the value of the utility function at any subset $\mathcal{R}\subset\mathcal{P}$. The constraint (8b) is the partitioned matroid $\mathcal{M}=\{\mathcal{P},\mathcal{I}\}$, which ensures that only one policy per agent is selected from each local policy set $\mathcal{P}_{i}$, $i\in\mathcal{A}$. An example application scenario of our problem of interest is shown in Fig. 1. Without loss of generality, to simplify notation, we assume that the policy space is $\mathcal{P}=\{1,\dots,n\}$ and is sorted agent-wise, with $1\in\mathcal{P}_{1}$ and $n\in\mathcal{P}_{N}$.

Finding the optimal solution $\mathcal{R}^{\star}\in\mathcal{I}$ of (8), even in a centralized manner, is NP-hard [28]; the computational time of finding the optimizer set increases exponentially with $N$ [29]. The well-known sequential greedy algorithm finds a suboptimal solution $\mathcal{R}_{\textup{SG}}$ for (8) with the optimality bound $f(\mathcal{R}_{\textup{SG}})\geq\frac{1}{2}f(\mathcal{R}^{\star})$, i.e., a $1/2$-approximation in the worst case [28]. More recently, by using the multilinear extension of the submodular utility function, Vondrák [21] developed a randomized centralized continuous greedy algorithm which achieves a $(1-1/\text{e})-O(1/T)$ approximation for the set value optimization (8) in the value oracle model; see Algorithm 1. (Pipage rounding is a polynomial-time algorithm which moves a fractional point $\boldsymbol{\mathbf{x}}$ on a hypercube to an integral point $\hat{\boldsymbol{\mathbf{x}}}$ on the same hypercube such that $F(\hat{\boldsymbol{\mathbf{x}}})\geq F(\boldsymbol{\mathbf{x}})$.) Our objective in this paper is to develop a distributed implementation of Algorithm 1 to solve (8) when the agents interact over a connected graph $\mathcal{G}$. Recall that in our problem setting, every agent $i\in\mathcal{A}$ can evaluate the utility function for a given $\mathcal{R}\subset\mathcal{P}$ but has access only to its own policy space $\mathcal{P}_{i}$.

Algorithm 1 Practical implementation of the continuous greedy process [21].
1: Init: $\boldsymbol{\mathbf{x}}\leftarrow\boldsymbol{\mathbf{0}}$, $t\leftarrow 1$
2: while $t\leq T$ do
3:     Draw $K$ samples of $\mathcal{R}$ from $\mathcal{P}$ according to $\boldsymbol{\mathbf{x}}$
4:     for $p\in\mathcal{P}$ do
5:         Estimate $w_{p}\sim\mathbb{E}[f(\mathcal{R}_{\boldsymbol{\mathbf{x}}}\cup\{p\})-f(\mathcal{R}_{\boldsymbol{\mathbf{x}}}\setminus\{p\})]$
6:     end for
7:     Solve for $\mathcal{R}^{\star}=\underset{\mathcal{R}\in\mathcal{I}}{\textup{argmax}}\,\sum_{p\in\mathcal{R}}w_{p}$
8:     Update the membership vector as $\boldsymbol{\mathbf{x}}\leftarrow\boldsymbol{\mathbf{x}}+\frac{1}{T}\boldsymbol{\mathbf{1}}_{\mathcal{R}^{\star}}$
9:     $t\leftarrow t+1$
10: end while
11: Use Pipage rounding to convert the fractional solution $\boldsymbol{\mathbf{x}}$ to an integral solution.

Note: $\boldsymbol{\mathbf{1}}_{\mathcal{R}^{\star}}$ is the $\boldsymbol{\mathbf{v}}$ in (10).
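The following is a minimal, self-contained Python sketch of the loop in Algorithm 1, specialized to a partitioned matroid with one policy per agent; the toy coverage utility and returning the fractional solution without Pipage rounding are illustrative simplifications, not part of [21].

import random

def sample_set(x):
    # Draw a random set according to the membership probability vector x.
    return {p for p, prob in x.items() if random.random() < prob}

def continuous_greedy(f, partition, T=20, K=200):
    # Sampling-based continuous greedy (steps 1-10 of Algorithm 1).
    ground = [p for block in partition for p in block]
    x = {p: 0.0 for p in ground}
    for _ in range(T):
        # Steps 3-6: estimate w_p ~ dF/dx_p(x) for every policy p by sampling.
        samples = [sample_set(x) for _ in range(K)]
        w = {p: sum(f(R | {p}) - f(R - {p}) for R in samples) / K for p in ground}
        # Steps 7-8: for a partitioned matroid with k_i = 1, the argmax over I
        # picks the best policy of each block; add 1/T to its membership.
        for block in partition:
            best = max(block, key=lambda p: w[p])
            x[best] += 1.0 / T
    return x  # fractional solution; Pipage rounding (step 11) would follow

# Toy coverage utility: number of points covered by the chosen placements.
cover = {"p1": {1, 2}, "p2": {2, 3}, "p3": {3, 4, 5}, "p4": {1}}
f = lambda R: len(set().union(*(cover[p] for p in R))) if R else 0
print(continuous_greedy(f, [["p1", "p2"], ["p3", "p4"]]))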

5 A polynomial-time distributed multi-agent randomized continuous greedy algorithm

Our proposed distributed multi-agent randomized continuous greedy algorithm to solve the set value optimization problem (8) over a connected graph $\mathcal{G}$ is given in Algorithm 2, whose convergence guarantee and suboptimality gap are given in Theorem 5.1 below. To provide insight into the construction of Algorithm 2, we first review the idea behind the central suboptimal solution via Algorithm 1, following [21]. We also provide some intermediate results that will be useful in establishing our results.

5.1 A short overview of the central continuous greedy process

As mentioned earlier, the continuous multilinear extension is a relaxation strategy that extends a submodular function $f(\mathcal{R})$, defined on the vertices of the $n$-dimensional hypercube $\{0,1\}^{n}$, to a continuous multilinear function $F$ defined on $[0,1]^{n}$. The two functions evaluate identically at any vector $\boldsymbol{\mathbf{x}}\in[0,1]^{n}$ that is the membership indicator vector of a set $\mathcal{R}\subset\mathcal{P}$. Then, by way of a process that runs continuously, depending only on local properties of $F$, we can produce a point $\boldsymbol{\mathbf{x}}\in P(\mathcal{M})$ to approximate the optimum $OPT=\underset{\boldsymbol{\mathbf{x}}\in P(\mathcal{M})}{\textup{max}}F(\boldsymbol{\mathbf{x}})$ (here recall (7)). The proposal in [21] is to move in the direction of a vector constrained by $P(\mathcal{M})$ that maximizes the local gain. To understand the logic behind Algorithm 1, let us review the conceptual steps of this continuous greedy process. [21] views the process as a particle starting at $\boldsymbol{\mathbf{x}}(0)=\boldsymbol{\mathbf{0}}$ and following the flow

$\frac{\text{d}\boldsymbol{\mathbf{x}}}{\text{d}t}=\boldsymbol{\mathbf{v}}(\boldsymbol{\mathbf{x}})~~\text{where}~~\boldsymbol{\mathbf{v}}(\boldsymbol{\mathbf{x}})=\underset{\boldsymbol{\mathbf{v}}\in P(\mathcal{M})}{\arg\max}\,(\boldsymbol{\mathbf{v}}.\nabla F(\boldsymbol{\mathbf{x}}))$   (9)

over the unit time interval $[0,1]$. We note that $\boldsymbol{\mathbf{x}}(t)$ for $t\in[0,1]$ is contained in $P(\mathcal{M})$, since it is a convex combination of vectors in $P(\mathcal{M})$. Now consider a point $\boldsymbol{\mathbf{x}}$ along the trajectory of the flow (9) and assume that $\boldsymbol{\mathbf{x}}^{\star}$ attains the true optimum, $OPT=F(\boldsymbol{\mathbf{x}}^{\star})$. Consider the direction $\boldsymbol{\mathbf{v}}^{\star}=\max\{\boldsymbol{\mathbf{x}}^{\star}-\boldsymbol{\mathbf{x}},\boldsymbol{\mathbf{0}}\}$, which is a nonnegative vector. Because $0\leq\boldsymbol{\mathbf{v}}^{\star}\leq\boldsymbol{\mathbf{x}}^{\star}\in P(\mathcal{M})$, we have $\boldsymbol{\mathbf{v}}^{\star}\in P(\mathcal{M})$. By virtue of Lemma 3.1, $F$ is monotone increasing; therefore, using $\max\{\boldsymbol{\mathbf{x}}^{\star}-\boldsymbol{\mathbf{x}},\boldsymbol{\mathbf{0}}\}=\max\{\boldsymbol{\mathbf{x}}^{\star},\boldsymbol{\mathbf{x}}\}-\boldsymbol{\mathbf{x}}$, we have $F(\boldsymbol{\mathbf{x}}+\boldsymbol{\mathbf{v}}^{\star})=F(\max\{\boldsymbol{\mathbf{x}}^{\star},\boldsymbol{\mathbf{x}}\})\geq F(\boldsymbol{\mathbf{x}}^{\star})$. However, $\boldsymbol{\mathbf{x}}+\boldsymbol{\mathbf{v}}^{\star}$ does not necessarily belong to $P(\mathcal{M})$. Therefore, let us consider $F(\boldsymbol{\mathbf{x}}+\zeta\boldsymbol{\mathbf{v}}^{\star})$ for $\zeta\geq 0$. It can be shown that $F(\boldsymbol{\mathbf{x}}+\zeta\boldsymbol{\mathbf{v}}^{\star})$ is concave in $\zeta$ and $\frac{\text{d}F}{\text{d}\zeta}$ is non-increasing. Thus, it can be established that $F(\boldsymbol{\mathbf{x}}+\boldsymbol{\mathbf{v}}^{\star})-F(\boldsymbol{\mathbf{x}})\leq\frac{\text{d}F}{\text{d}\zeta}|_{\zeta=0}=\boldsymbol{\mathbf{v}}^{\star}.\nabla F(\boldsymbol{\mathbf{x}})$. Since $\boldsymbol{\mathbf{v}}^{\star}\in P(\mathcal{M})$ and $\boldsymbol{\mathbf{v}}(\boldsymbol{\mathbf{x}})$ in (9) maximizes $\boldsymbol{\mathbf{v}}.\nabla F(\boldsymbol{\mathbf{x}})$ over $P(\mathcal{M})$, we can write $\boldsymbol{\mathbf{v}}(\boldsymbol{\mathbf{x}}).\nabla F(\boldsymbol{\mathbf{x}})\geq\boldsymbol{\mathbf{v}}^{\star}.\nabla F(\boldsymbol{\mathbf{x}})\geq F(\boldsymbol{\mathbf{x}}+\boldsymbol{\mathbf{v}}^{\star})-F(\boldsymbol{\mathbf{x}})\geq OPT-F(\boldsymbol{\mathbf{x}})$. We then note that $\frac{\text{d}F}{\text{d}t}=\boldsymbol{\mathbf{v}}(\boldsymbol{\mathbf{x}}).\nabla F(\boldsymbol{\mathbf{x}})\geq OPT-F(\boldsymbol{\mathbf{x}}(t))$. Therefore, given $\boldsymbol{\mathbf{x}}(0)=\boldsymbol{\mathbf{0}}$, using the Comparison Lemma [30], we can conclude that $F(\boldsymbol{\mathbf{x}}(t))\geq(1-\text{e}^{-t})OPT$, and thus $\boldsymbol{\mathbf{x}}(1)\in P(\mathcal{M})$ and $F(\boldsymbol{\mathbf{x}}(1))\geq(1-1/\text{e})OPT$. In the second stage of the algorithm, the fractional solution $\boldsymbol{\mathbf{x}}(1)$ is rounded to a point in $\{0,1\}^{n}$ by use of the Pipage rounding method; see [31] for more details. The exposition above is the conceptual design behind the continuous greedy process. Algorithm 1 is a practical implementation achieved by the numerical iterative process

$\boldsymbol{\mathbf{x}}(t+1)=\boldsymbol{\mathbf{x}}(t)+\frac{1}{T}\boldsymbol{\mathbf{v}}(t),$   (10)

and the use of sampling to compute $\nabla F(\boldsymbol{\mathbf{x}})$ and, consequently, $\boldsymbol{\mathbf{v}}(t)$; see [21] for more details. In what follows, we explain a practical distributed implementation of the continuous greedy process, which is realized as Algorithm 2 and is inspired by this central solution.

Algorithm 2 Discrete distributed implementation of the continuous greedy process.
1: Init: $\bar{\mathcal{P}}\leftarrow\emptyset$, $\mathcal{F}_{i}\leftarrow\emptyset$, $t\leftarrow 1$
2: while $t\leq T$ do
3:    for $i\in\mathcal{A}$ do
4:        Draw $K_{i}$ sample policy sets $\mathcal{R}$ such that $q\in\mathcal{R}$ with probability $\alpha$ for all $(q,\alpha)\in\mathcal{F}_{i}$
5:        for $p\in\mathcal{P}_{i}$ do
6:            Compute $w^{i}_{p}\sim\mathbb{E}[f(\mathcal{R}\cup\{p\})-f(\mathcal{R}\setminus\{p\})]$ using the sample policy sets of step 4
7:        end for
8:        Solve for $p^{\star}=\underset{p\in\mathcal{P}_{i}}{\textup{argmax}}\,w^{i}_{p}$
9:        $\mathcal{F}_{i}^{-}\leftarrow\mathcal{F}_{i}\oplus\{(p^{\star},\frac{1}{T})\}$
10:        Broadcast $\mathcal{F}_{i}^{-}$ to the neighbors $\mathcal{N}_{i}$
11:        $\mathcal{F}_{i}\leftarrow\underset{j\in\mathcal{N}_{i}\cup\{i\}}{\textup{MAX}}\,\mathcal{F}_{j}^{-}$
12:    end for
13:    $t\leftarrow t+1$
14: end while
15: for $i\in\mathcal{A}$ do
16:    Sample one policy $\bar{p}\in\mathcal{P}_{i}$ using $\mathcal{F}_{i}$
17:    $\bar{\mathcal{P}}\leftarrow\bar{\mathcal{P}}\cup\{\bar{p}\}$
18: end for

5.2 Design and analysis of the distributed continuous greedy process

We start our design and analysis of the distributed continuous greedy process by introducing our notation and the set of computations that the agents carry out locally, using their local information and interactions with their neighbors. The algorithm is performed over a finite time horizon of $T$ steps. Let $\mathcal{F}_{i}(t)\subset\mathcal{P}\times[0,1]$ be the information set of agent $i$ at time step $t$, initialized as $\mathcal{F}_{i}(0)=\emptyset$. For any pair $(p,\alpha)\in\mathcal{F}_{i}(t)$, the first element is the policy and the second element is the corresponding membership probability. We let $\boldsymbol{\mathbf{x}}_{i}(t)\in{\mathbb{R}}^{n}$ (recall $|\mathcal{P}|=n$) be the local copy of the membership probability vector of our suboptimal solution of (8) at agent $i$ at time step $t$, defined according to

$x_{ip}(t)=\begin{cases}\alpha,&(p,\alpha)\in\mathcal{F}_{i}(t),\\ 0&\text{otherwise},\end{cases}\qquad p\in\mathcal{P}.$   (11)

Recall that $\mathcal{P}=\{1,\dots,n\}$ is sorted agent-wise, with $1\in\mathcal{P}_{1}$ and $n\in\mathcal{P}_{N}$. Hence, $\boldsymbol{\mathbf{x}}_{i}(t)=[\hat{\boldsymbol{\mathbf{x}}}_{i1}^{\top}(t),\cdots,\boldsymbol{\mathbf{x}}_{ii}^{\top}(t),\cdots,\hat{\boldsymbol{\mathbf{x}}}_{iN}^{\top}(t)]^{\top}$, where $\boldsymbol{\mathbf{x}}_{ii}(t)\in{\mathbb{R}}_{\geq 0}^{|\mathcal{P}_{i}|}$ is the membership probability vector of agent $i$'s own policies at iteration $t$, while $\hat{\boldsymbol{\mathbf{x}}}_{ij}(t)\in{\mathbb{R}}_{\geq 0}^{|\mathcal{P}_{j}|}$ is agent $i$'s local estimate of the membership probability vector of agent $j$. At time step $t$, agent $i$ solves the optimization problem

$\widetilde{\boldsymbol{\mathbf{v}}}_{i}(t)=\underset{\boldsymbol{\mathbf{y}}\in P_{i}(\mathcal{M})}{\textup{argmax}}\;\boldsymbol{\mathbf{y}}.\widetilde{\nabla F}(\boldsymbol{\mathbf{x}}_{i}(t))$   (12)

with

$P_{i}(\mathcal{M})=\Big\{[\boldsymbol{\mathbf{y}}_{1}^{\top},\cdots,\boldsymbol{\mathbf{y}}_{N}^{\top}]^{\top}\in{\mathbb{R}}_{\geq 0}^{n}\,\Big|\,\boldsymbol{\mathbf{1}}^{\top}.\boldsymbol{\mathbf{y}}_{i}\leq 1,\;\boldsymbol{\mathbf{y}}_{j}=\boldsymbol{\mathbf{0}},~j\in\mathcal{A}\backslash\{i\}\Big\}.$   (13)

The term $\widetilde{\nabla F}(\boldsymbol{\mathbf{x}}_{i}(t))$ is the estimate of $\nabla F(\boldsymbol{\mathbf{x}}_{i}(t))$, computed by drawing $K_{i}$ samples of the set $\mathcal{R}_{\boldsymbol{\mathbf{x}}_{i}(t)}$ according to the membership probability vector $\boldsymbol{\mathbf{x}}_{i}(t)$; recall (5). Hence, $\frac{\partial F}{\partial x_{p}}$ is estimated by averaging $f(\mathcal{R}_{\boldsymbol{\mathbf{x}}_{i}(t)}\cup\{p\})-f(\mathcal{R}_{\boldsymbol{\mathbf{x}}_{i}(t)}\setminus\{p\})$ over the samples. We denote the $p$th element of $\widetilde{\nabla F}(\boldsymbol{\mathbf{x}}_{i}(t))$ by $w_{p}^{i}$ and represent it by

$w_{p}^{i}\sim\mathbb{E}[f(\mathcal{R}_{\boldsymbol{\mathbf{x}}_{i}(t)}\cup\{p\})-f(\mathcal{R}_{\boldsymbol{\mathbf{x}}_{i}(t)}\setminus\{p\})].$   (14)

We note that, given the definition of $P_{i}(\mathcal{M})$, to compute (12) we only need $w_{p}^{i}$ for $p\in\mathcal{P}_{i}$.

Remark 5.1 (Local computation of $w_{p}^{i}$, $p\in\mathcal{P}_{i}$, by agent $i$)

Given the definition (11), we note that $w_{p}^{i}$, an estimate of $\frac{\partial F}{\partial x_{p}}$, can be obtained by drawing $K_{i}$ sample policy sets $\mathcal{R}$ such that $q\in\mathcal{R}$ with probability $\alpha$ for all $(q,\alpha)\in\mathcal{F}_{i}$ and using

$w^{i}_{p}\sim\mathbb{E}[f(\mathcal{R}\cup\{p\})-f(\mathcal{R}\setminus\{p\})],\qquad p\in\mathcal{P}_{i}.$   (15)

Let each agent propagate its local variable according to

$\boldsymbol{\mathbf{x}}_{i}^{-}(t+1)=\boldsymbol{\mathbf{x}}_{i}(t)+\frac{1}{T}\widetilde{\boldsymbol{\mathbf{v}}}_{i}(t).$   (16)

One can make a conceptual connection between (16) and the practical implementation of (9) discussed earlier. Because this propagation is based only on the local information of agent $i$, next, each agent, by interacting with its neighbors, updates its propagated $\boldsymbol{\mathbf{x}}_{i}^{-}(t+1)$ by element-wise maximum seeking

$\boldsymbol{\mathbf{x}}_{i}(t+1)=\underset{j\in\mathcal{N}_{i}\cup\{i\}}{\textup{max}}\;\boldsymbol{\mathbf{x}}_{j}^{-}(t+1).$   (17)

Lemma 5.1 below shows that, as one expects, $\boldsymbol{\mathbf{x}}_{ii}(t+1)=\boldsymbol{\mathbf{x}}_{ii}^{-}(t+1)$, i.e., the component of $\boldsymbol{\mathbf{x}}_{i}$ corresponding to agent $i$ itself is the propagated value maintained at agent $i$, and not the estimated value of any of its neighbors.

Lemma 5.1

Assume that the agents follow the distributed Algorithm 2, and let $\bar{\boldsymbol{\mathbf{x}}}(t)=\underset{i\in\mathcal{A}}{\textup{max}}\,\boldsymbol{\mathbf{x}}_{i}(t)$, where $\boldsymbol{\mathbf{x}}_{i}$ is given in (11). Then, $\bar{\boldsymbol{\mathbf{x}}}(t)=[\boldsymbol{\mathbf{x}}_{11}^{\top}(t),\cdots,\boldsymbol{\mathbf{x}}_{NN}^{\top}(t)]^{\top}$ at any time step $t$.

The proof of Lemma 5.1 is given in Appendix A. We note that it follows from Lemma 5.1 that

$\boldsymbol{\mathbf{x}}_{jj}(t)=\boldsymbol{\mathbf{x}}_{jj}^{-}(t),\qquad j\in\mathcal{A}.$   (18)

In the distributed Algorithm 2, at each time step $t$, each agent has its own belief about the probabilities of the policies, which is not necessarily the same as the beliefs of the other agents. The following result bounds the difference between the beliefs of the agents.

Proposition 5.1

Let the agents follow the distributed Algorithm 2. Then, the vectorized membership probability $\boldsymbol{\mathbf{x}}_{i}(t)$ of each agent $i\in\mathcal{A}$, realized from $\mathcal{F}_{i}(t)$, satisfies

$0\leq\frac{1}{N}\boldsymbol{\mathbf{1}}.(\bar{\boldsymbol{\mathbf{x}}}(t)-\boldsymbol{\mathbf{x}}_{i}(t))\leq\frac{1}{T}d(\mathcal{G}),$   (19a)
$\bar{\boldsymbol{\mathbf{x}}}(t+1)-\bar{\boldsymbol{\mathbf{x}}}(t)=\frac{1}{T}\sum_{i\in\mathcal{A}}\widetilde{\boldsymbol{\mathbf{v}}}_{i}(t),$   (19b)
$\frac{1}{N}\boldsymbol{\mathbf{1}}.(\bar{\boldsymbol{\mathbf{x}}}(t+1)-\bar{\boldsymbol{\mathbf{x}}}(t))=\frac{1}{T}.$   (19c)

The proof of Proposition 5.1 is given in Appendix A. Because $f$ is normalized and monotone increasing, we have the guarantee that $w_{p}^{i}\geq 0$. Therefore, without loss of generality, we know that one realization of $\widetilde{\boldsymbol{\mathbf{v}}}_{i}(t)$ in (12) corresponds to $\boldsymbol{\mathbf{1}}_{\{p^{\star}\}}$, where

$p^{\star}=\underset{p\in\mathcal{P}_{i}}{\arg\max}\,w_{p}^{i}.$   (20)

Next, let each agent $i\in\mathcal{A}$ propagate its information set according to

$\mathcal{F}_{i}^{-}(t+1)=\mathcal{F}_{i}(t)\oplus\left\{(p^{\star},\tfrac{1}{T})\right\},$   (21)

and update it using a local interaction with its neighbors according to

$\mathcal{F}_{i}(t+1)=\underset{j\in\mathcal{N}_{i}\cup\{i\}}{\textup{MAX}}\,\mathcal{F}_{j}^{-}(t+1).$   (22)

By the definition of the $\oplus$ and MAX operators, we have the guarantee that if $(p,\alpha_{1})\in\mathcal{F}_{i}$, then there exists no $\alpha_{2}\neq\alpha_{1}$ such that $(p,\alpha_{2})\in\mathcal{F}_{i}$.

Lemma 5.2

For $\widetilde{\boldsymbol{\mathbf{v}}}_{i}(t)=\boldsymbol{\mathbf{1}}_{\{p^{\star}\}}$, the vectors $\boldsymbol{\mathbf{x}}^{-}_{i}(t+1)$ and $\boldsymbol{\mathbf{x}}_{i}(t+1)$ computed from, respectively, (16) and (17) are the same as $\boldsymbol{\mathbf{x}}^{-}_{i}(t+1)$ and $\boldsymbol{\mathbf{x}}_{i}(t+1)$ constructed from, respectively, $\mathcal{F}_{i}^{-}(t+1)$ and $\mathcal{F}_{i}(t+1)$ using (11).

  • Proof of Lemma 5.2

    The proof follows trivially from the definition of the operator \oplus and (11). \Box

Initialized with $\mathcal{F}_{i}(0)=\emptyset$, $i\in\mathcal{A}$, the updates (20), (21), and (22), where $w_{p}^{i}$ is computed via (15), constitute a distributed iterative process, formally stated as Algorithm 2, that runs for $T$ steps. At the end of these $T$ steps, as stated in Algorithm 2, each agent $i\in\mathcal{A}$ obtains its suboptimal policy $\bar{\mathcal{P}}_{i}$ by sampling one policy $\bar{p}\in\mathcal{P}_{i}$ with the probability given by $\boldsymbol{\mathbf{x}}_{ii}(T)$, where, for $p\in\mathcal{P}_{i}$,

$x_{ip}(T)=\begin{cases}\alpha,&(p,\alpha)\in\mathcal{F}_{i}(T),\\ 0&\text{otherwise}.\end{cases}$
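To make the agent-level computations of Algorithm 2 concrete, the following is a minimal simulation sketch (the three-agent line graph, the toy coverage utility, and the policy names are illustrative assumptions, not part of the paper):

import random

def sample_set(F):
    # Draw a policy set from an information set F (dict: policy -> probability).
    return {p for p, prob in F.items() if random.random() < prob}

def local_step(F_i, P_i, f, T, K):
    # Steps 4-9 of Algorithm 2 for one agent: estimate w_p over its own policies
    # P_i, pick the best policy p_star, and add mass 1/T to it (the (+) update).
    samples = [sample_set(F_i) for _ in range(K)]
    w = {p: sum(f(R | {p}) - f(R - {p}) for R in samples) / K for p in P_i}
    p_star = max(P_i, key=lambda p: w[p])
    G = dict(F_i)
    G[p_star] = G.get(p_star, 0.0) + 1.0 / T
    return G

def max_merge(F_collection):
    # Steps 10-11: element-wise MAX over the agent's own and its neighbors' sets.
    merged = {}
    for F in F_collection:
        for p, v in F.items():
            merged[p] = max(merged.get(p, v), v)
    return merged

# Three agents on a line graph 0 - 1 - 2, each owning two candidate placements.
cover = {"a1": {1, 2}, "a2": {2}, "b1": {3}, "b2": {3, 4}, "c1": {5}, "c2": {1, 5}}
f = lambda R: len(set().union(*(cover[p] for p in R))) if R else 0
P = [["a1", "a2"], ["b1", "b2"], ["c1", "c2"]]
nbrs = {0: [1], 1: [0, 2], 2: [1]}
T, K = 20, 100
F = [dict() for _ in range(3)]
for _ in range(T):
    F_minus = [local_step(F[i], P[i], f, T, K) for i in range(3)]
    F = [max_merge([F_minus[i]] + [F_minus[j] for j in nbrs[i]]) for i in range(3)]
# Steps 15-18: each agent samples one of its own policies with probability x_ii(T).
choice = []
for i in range(3):
    probs = [F[i].get(p, 0.0) for p in P[i]]
    choice.append(random.choices(P[i], weights=probs)[0])
print(choice)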

The following result, whose proof is available in Appendix A, gives the convergence guarantee and suboptimality gap of Algorithm 2.

Theorem 5.1 (Convergence guarantee and suboptimality gap of Algorithm 2)

Let $f:2^{\mathcal{P}}\to{\mathbb{R}}_{\geq 0}$ be a normalized, monotone increasing, and submodular set function, and let $\mathcal{S}^{\star}$ be the optimizer of problem (8). Then, the admissible policy set $\bar{\mathcal{P}}$, the output of the distributed Algorithm 2, satisfies

$\left(1-\frac{1}{\textup{e}}\right)\left(1-\left(2N^{2}d(\mathcal{G})+\frac{1}{2}N^{2}+N\right)\frac{1}{T}\right)f(\mathcal{S}^{\star})\leq\mathbb{E}[f(\bar{\mathcal{P}})],$

with probability $\left(\prod_{i\in\mathcal{A}}(1-2\textup{e}^{-\frac{1}{8T^{2}}K_{i}})^{|\mathcal{P}_{i}|}\right)^{T}$.

By simplifying the probability statement and dropping the higher-order terms, the optimality gap guarantee of Theorem 5.1 holds with probability at least $1-2T\,n\,\textup{e}^{-\frac{1}{8T^{2}}\underline{K}}$, where $\underline{K}=\min\{K_{1},\cdots,K_{N}\}$; note that $1-2T\,n\,\textup{e}^{-\frac{1}{8T^{2}}\underline{K}}\leq\left(\prod_{i\in\mathcal{A}}(1-2\textup{e}^{-\frac{1}{8T^{2}}K_{i}})^{|\mathcal{P}_{i}|}\right)^{T}$. Hence, the probability guarantee improves as $\underline{K}$, the minimum number of samples collected by the agents, increases relative to $T^{2}$.
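For a rough sense of the required sampling effort (a direct rearrangement of the simplified bound above, under the assumption that it should hold with a target probability $p$), the simplified guarantee holds with probability at least $p$ whenever

$1-2T\,n\,\textup{e}^{-\frac{\underline{K}}{8T^{2}}}\geq p\quad\Longleftrightarrow\quad\underline{K}\geq 8T^{2}\ln\frac{2Tn}{1-p},$

so the per-agent sample count should grow roughly like $T^{2}\log(Tn)$.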

Remark 5.2 (Extra communication for improved optimality gap)

Replacing the update step (17) with $\boldsymbol{\mathbf{x}}_{i}(t+1)=\boldsymbol{\mathbf{y}}_{i}(d(\mathcal{G}))$, where $\boldsymbol{\mathbf{y}}_{i}(0)=\boldsymbol{\mathbf{x}}_{i}^{-}(t+1)$ and

$\boldsymbol{\mathbf{y}}_{i}(m)=\underset{j\in\mathcal{N}_{i}\cup\{i\}}{\textup{max}}\,\boldsymbol{\mathbf{y}}_{j}(m-1),\qquad m\in\{1,\cdots,d(\mathcal{G})\},$

i.e., starting with $\boldsymbol{\mathbf{x}}_{i}^{-}(t+1)$ and recursively repeating the update step (17), using the output of the previous recursion, $d(\mathcal{G})$ times, each agent $i\in\mathcal{A}$ arrives at $\boldsymbol{\mathbf{x}}_{i}(t+1)=\bar{\boldsymbol{\mathbf{x}}}(t+1)$ (recall Lemma 5.1). Hence, for this revised implementation, following the proof of Theorem 5.1, we observe that (33) is replaced by $\left|\frac{\partial F}{\partial x_{p}}(\bar{\boldsymbol{\mathbf{x}}}(t))-\frac{\partial F}{\partial x_{p}}({\boldsymbol{\mathbf{x}}}_{i}(t))\right|=0$, which consequently leads to

$\left(1-\frac{1}{\textup{e}}\right)\left(1-\left(\frac{1}{2}N^{2}+N\right)\frac{1}{T}\right)f(\mathcal{S}^{\star})\leq\mathbb{E}[f(\bar{\mathcal{P}})],$   (23)

with probability $\left(\prod_{i\in\mathcal{A}}(1-2\textup{e}^{-\frac{1}{8T^{2}}K_{i}})^{|\mathcal{P}_{i}|}\right)^{T}$. This improved optimality gap is achieved at the cost of $(d(\mathcal{G})-1)T$ extra communications per agent. The optimality bound (23) is the same bound achieved by the centralized algorithm of [21]. To implement this revision, step 11 of Algorithm 2 (equivalent to (22)) should be replaced by $\mathcal{F}_{i}=\mathcal{H}_{i}(d(\mathcal{G}))$, where $\mathcal{H}_{i}(0)=\mathcal{F}_{i}^{-}$ and

$\mathcal{H}_{i}(m)=\underset{j\in\mathcal{N}_{i}\cup\{i\}}{\textup{MAX}}\,\mathcal{H}_{j}(m-1),\qquad m\in\{1,\cdots,d(\mathcal{G})\}.$   (24)
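A minimal sketch of this revised consensus step, reusing the max_merge helper from the earlier sketch (the diameter diam is assumed to be known to all agents):

def repeated_max_consensus(F_minus, nbrs, diam):
    # Remark 5.2: repeat the neighborhood MAX update d(G) times so that every
    # agent ends up holding the network-wide maximum value for every policy.
    F = list(F_minus)
    for _ in range(diam):
        F = [max_merge([F[i]] + [F[j] for j in nbrs[i]]) for i in range(len(F))]
    return F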
(a) Case 1   (b) Case 2   (c) Case 3   (d) Case 4   (e) Case 5   (f) Case 6   (g)   (h)
Figure 2: Plots (a)-(f) show the 6 different $\mathtt{SEQ}$s used in the sequential greedy algorithm. Plot (g) shows the outcome of using Algorithm 2, whereas plot (h) shows the outcome of the sequential greedy algorithm when the $\mathtt{SEQ}$ of Case 1 (plot (a)) is used.

6 Numerical Example

Consider the multi-agent sensor placement problem introduced in Fig. 1 for 5 agents with $\mathcal{B}_{i}=\mathcal{B}$, i.e., each agent is able to move to any of the sensor placement points. This sensor placement problem is cast as the optimization problem (8). The field is a 6 unit by 6 unit square and the feasible sensor locations are the 6 by 6 grid in the center square of the field; see Fig. 2. The points of interest, 900 in total, are spread around the map (small red dots in Fig. 2). The sensing zones of the agents $\mathcal{A}=\{a,b,c,d,e\}$ are circles with radii of, respectively, $\{0.5,0.6,0.7,0.8,1.5\}$. The agents communicate over a ring graph, as shown in Fig. 2. We first solve this problem using our proposed distributed Algorithm 2. The results of the simulation for different iteration and sampling numbers are shown in Table 1. The algorithm produces good results at a modest number of iterations and samples (e.g., see $T=20$ and $K=500$). Fig. 2(g) shows the resulting deployment using Algorithm 2. Next, we solve this problem using the sequential greedy algorithm [29] in a decentralized way by first choosing a route $\mathtt{SEQ}=1\to 2\to 3\to 4\to 5$ that visits all the agents, and then giving $\mathtt{SEQ}$ to the agents so they follow it to share their information in a sequential manner. Figures 2(a)-(f) give 6 possible $\mathtt{SEQ}$s, denoted by the semi-circular arrows inside the networks. The results of running the sequential greedy algorithm over the sequences in Fig. 2(a)-(f) are shown in Table 2.

T \ K | 10000 | 500 | 100 | 50  | 10  | 5   | 1
100   | 768   | 768 | 718 | 710 | 718 | 716 | 696
20    | 768   | 768 | 718 | 710 | 726 | 716 | 696
10    | 661   | 640 | 657 | 640 | 634 | 602 | 551
5     | 630   | 630 | 634 | 626 | 583 | 608 | 540
1     | 456   | 456 | 456 | 456 | 456 | 456 | 456
Table 1: The outcome of Algorithm 2 for different numbers of iterations $T$ (rows) and sample counts $K$ (columns).
Case    | 1   | 2   | 3   | 4   | 5   | 6
Utility | 634 | 704 | 699 | 640 | 767 | 760
Table 2: Outcome of the sequential greedy algorithm for the sequences of Cases 1-6.

What stands out about the sequential greedy algorithm is that the choice of sequence can affect its outcome significantly. We can attribute this inconsistency to the heterogeneity of the sensors' measurement zones. We can see that when sensor $e$ is given the last priority to make its choice, the sequential greedy algorithm performs poorly. This can be explained by agents with smaller sensing zones picking high-density areas but not being able to cover them fully; see Fig. 2(h), which depicts the outcome of the sequential greedy algorithm using the sequence of Case 1. A simpler example justifying this observation is shown in Fig. 3, with two disjoint clusters of points and two sensors; a toy instance in this spirit is sketched below. One may suggest sequencing the agents from the largest to the smallest sensing zone; however, this is not necessarily the best choice, as we can see in Table 2: the utility of Case 6 is less than that of Case 5 (the conjecture of sequencing the agents from strongest to weakest is not valid). Moreover, this ordering may lead to a very long $\mathtt{SEQ}$ over the communication graph. Interestingly, this inconsistency does not appear in the solutions of Algorithm 2, where the agents intrinsically overcome the importance of a sequence by deciding the positions of the sensors over a time horizon of communication and by exchanging their information sets.
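To illustrate this order dependence on a minimal toy instance in the spirit of Fig. 3 (the policy names, coverage sets, and utility below are hypothetical), a decentralized sequential greedy sketch could read:

def sequential_greedy(f, P_blocks, seq):
    # Each agent, in the order given by seq, adds the policy from its own block
    # with the largest marginal gain given the choices of the earlier agents.
    chosen = set()
    for i in seq:
        best = max(P_blocks[i], key=lambda p: f(chosen | {p}) - f(chosen))
        chosen.add(best)
    return chosen

# Two clusters of points; the "orange" sensor covers more than the "blue" one.
cover = {"blue_A": {1, 2}, "blue_B": {5}, "orange_A": {1, 2, 3, 4}, "orange_B": {5}}
f = lambda R: len(set().union(*(cover[p] for p in R))) if R else 0
P_blocks = {"blue": ["blue_A", "blue_B"], "orange": ["orange_A", "orange_B"]}
print(f(sequential_greedy(f, P_blocks, seq=["blue", "orange"])))   # 4 points covered
print(f(sequential_greedy(f, P_blocks, seq=["orange", "blue"])))   # 5 points covered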

Figure 3: or The sequential greedy algorithm, when the blue agent chooses first assigns both the blue and the orange agents to point A resulting in inferior performance compared to the case that the orange agent chooses first. In the later case, orange agent gets A and the blue agent gets B, which is indeed the optimal solution.

7 Conclusion

We proposed a distributed suboptimal algorithm to solve the problem of maximizing a monotone increasing submodular set function subject to a partitioned matroid constraint. Our problem of interest was motivated by optimal multi-agent sensor placement problems in discrete space. Our algorithm is a practical decentralization of a multilinear-extension-based algorithm and achieves a $(1-1/\textup{e}-O(1/T))$ optimality gap, which is an improvement over the $1/2$ optimality gap of the well-known sequential greedy algorithm. In our numerical example, we compared the outcome obtained by our proposed algorithm with that of a decentralized sequential greedy algorithm constructed by assigning a priority sequence to the agents. We showed that the outcome of the sequential greedy algorithm is inconsistent and depends on the sequence. In contrast, the outcome of our algorithm, owing to its iterative nature, tended to be consistent, which in some ways also explains its better optimality gap over the sequential greedy algorithm. Our future work is to study the robustness of our proposed algorithm to message dropout.

Acknowledgments

This work is supported by NSF award IIS-SAS-1724331.

References

  • [1] S. Martínez and F. Bullo, “Optimal sensor placement and motion coordination for target tracking,” Automatica, vol. 42, no. 4, pp. 661–668, 2006.
  • [2] N. Zhou, C. Cassandras, X. Yu, and S. Andersson, “Decentralized event-driven algorithms for multi-agent persistent monitoring tasks,” in IEEE Int. Conf. on Decision and Control, pp. 4064–4069, 2017.
  • [3] Y. Khazaeni and C. Cassandras, “Event-driven trajectory optimization for data harvesting in multi-agent systems,” IEEE Tran. on Control of Network Systems, 2017.
  • [4] N. Zhou, X. Yu, S. Andersson, and C. Cassandras, “Optimal event-driven multi-agent persistent monitoring of a finite set of data sources,” IEEE Tran. on Automatic Control, 2018.
  • [5] N. Rezazadeh and S. S. Kia, “A sub-modular receding horizon approach to persistent monitoring for a group of mobile agents over an urban area,” IFAC-PapersOnLine, vol. 52, no. 20, pp. 217–222, 2019.
  • [6] X. Duan, M. George, R. Patel, and F. Bullo, “Robotic surveillance based on the meeting time of random walks,” IEEE Tran. on Robotics and Automation, 2020.
  • [7] S. Welikala and C. Cassandras, “Event-driven receding horizon control for distributed estimation in network systems,” arXiv preprint arXiv:2009.11958, 2020.
  • [8] S. Alamdari, E. Fata, and L. Smith, “Persistent monitoring in discrete environments: Minimizing the maximum weighted latency between observations,” The International Journal of Robotics Research, vol. 33, no. 1, pp. 138–154, 2014.
  • [9] N. Mehr and R. Horowitz, “A submodular approach for optimal sensor placement in traffic networks,” in 2018 Annual American Control Conference (ACC), pp. 6353–6358, IEEE, 2018.
  • [10] H. A, M. Ghasemi, H. Vikalo, and U. Topcu, “Randomized greedy sensor selection: Leveraging weak submodularity,” IEEE Transactions on Automatic Control, 2020.
  • [11] V. Tzoumas, A. Jadbabaie, and G. J. Pappas, “Sensor placement for optimal kalman filtering: Fundamental limits, submodularity, and algorithms,” in American Control Conference, pp. 191–196, IEEE, 2016.
  • [12] M. Shamaiah, S. Banerjee, and H. Vikalo, “Greedy sensor selection: Leveraging submodularity,” in IEEE conference on decision and control, pp. 2572–2577, 2010.
  • [13] S. T. Jawaid and S. Smith, “Submodularity and greedy algorithms in sensor scheduling for linear dynamical systems,” Automatica, vol. 61, pp. 282–288, 2015.
  • [14] G. Nemhauser, L. Wolsey, and M. Fisher, “An analysis of approximations for maximizing submodular set functions—i,” Mathematical programming, vol. 14, no. 1, pp. 265–294, 1978.
  • [15] L. Fisher, G. Nemhauser, and L. Wolsey, “An analysis of approximations for maximizing submodular set functions—ii,” in Polyhedral combinatorics, pp. 73–87, Springer, 1978.
  • [16] K. Wei, R. Iyer, and J. Bilmes, “Fast multi-stage submodular maximization,” in International conference on machine learning, pp. 1494–1502, PMLR, 2014.
  • [17] B. Mirzasoleiman, A. Karbasi, R. Sarkar, and A. Krause, “Distributed submodular maximization: Identifying representative elements in massive data,” in Advances in Neural Information Processing Systems, pp. 2049–2057, 2013.
  • [18] B. Mirzasoleiman, M. Zadimoghaddam, and A. Karbasi, “Fast distributed submodular cover: Public-private data summarization,” in Advances in Neural Information Processing Systems, pp. 3594–3602, 2016.
  • [19] R. Kumar, B. Moseley, S. Vassilvitskii, and A. Vattani, “Fast greedy algorithms in mapreduce and streaming,” ACM Transactions on Parallel Computing, vol. 2, no. 3, pp. 1–22, 2015.
  • [20] P. S. Raut, O. Sadeghi, and M. Fazel, “Online dr-submodular maximization with stochastic cumulative constraints,” arXiv preprint arXiv:2005.14708, 2020.
  • [21] J. Vondrák, “Optimal approximation for the submodular welfare problem in the value oracle model,” in Proceedings of the fortieth annual ACM symposium on Theory of computing, pp. 67–74, 2008.
  • [22] A. A. Bian, B. Mirzasoleiman, J. Buhmann, and A. Krause, “Guaranteed non-convex optimization: Submodular maximization over continuous domains,” in Artificial Intelligence and Statistics, pp. 111–120, 2017.
  • [23] A. Mokhtari, H. Hassani, and A. Karbasi, “Stochastic conditional gradient methods: From convex minimization to submodular maximization,” Journal of Machine Learning Research, vol. 21, no. 105, pp. 1–49, 2020.
  • [24] O. Sadeghi and M. Fazel, “Online continuous dr-submodular maximization with long-term budget constraints,” in International Conference on Artificial Intelligence and Statistics, pp. 4410–4419, 2020.
  • [25] A. Robey, A. Adibi, B. Schlotfeldt, J. G. Pappas, and H. Hassani, “Optimal algorithms for submodular maximization with distributed constraints,” arXiv preprint arXiv:1909.13676, 2019.
  • [26] H. W, “Probability inequalities for sums of bounded random variables,” in The Collected Works of Wassily Hoeffding, pp. 409–426, Springer, 1994.
  • [27] C. Chekuri and A. Kumar, “Maximum coverage problem with group budget constraints and applications,” in Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques, pp. 72–83, Springer, 2004.
  • [28] M. Conforti and G. Cornuéjols, “Submodular set functions, matroids and the greedy algorithm: tight worst-case bounds and some generalizations of the rado-edmonds theorem,” Discrete applied mathematics, vol. 7, no. 3, pp. 251–274, 1984.
  • [29] N. Rezazadeh and S. S. Kia, “A sub-modular receding horizon solution for mobile multi-agent persistent monitoring,” arXiv preprint arXiv:1908.04425, 2019.
  • [30] H. K. Khalil and J. W. Grizzle, Nonlinear systems, vol. 3. Prentice hall Upper Saddle River, NJ, 2002.
  • [31] A. A. Ageev and M. I. Sviridenko, “Pipage rounding: A new method of constructing algorithms with proven performance guarantee,” Journal of Combinatorial Optimization, vol. 8, no. 3, pp. 307–328, 2004.

Appendix A: Proof of the Results in Section 5

  • Proof of Lemma 5.1

    Since $f$ is a monotone increasing and submodular set function, we have $f(\mathcal{R}_{\boldsymbol{\mathbf{x}}_{i}(t)}\cup\{p\})-f(\mathcal{R}_{\boldsymbol{\mathbf{x}}_{i}(t)}\setminus\{p\})\geq 0$, and hence $\widetilde{\nabla F}(\boldsymbol{\mathbf{x}}_{i}(t))$ has nonnegative entries for all $i\in\mathcal{A}$. As a result, the optimization (12) over the set (13) outputs a vector $\widetilde{\boldsymbol{\mathbf{v}}}_{i}(t)\in P_{i}(\mathcal{M})$ whose entries are greater than or equal to zero. Hence, according to the propagation and update rules (16) and (17), we can conclude that $\boldsymbol{\mathbf{x}}_{ii}(t)$ has non-decreasing elements, that only agent $i$ updates it, and that the other agents only copy this value as $\hat{\boldsymbol{\mathbf{x}}}_{ji}(t)$. Therefore, we can conclude that $\hat{x}_{jip}(t)\leq x_{iip}(t)$ for all $p\in\mathcal{P}_{i}$, which concludes the proof. $\Box$

  • Proof of Proposition 5.1

    ff is a monotone increasing and submodular set function therefor f(𝐱i(t){p})f(𝐱i(t){p})0f(\mathcal{R}_{\boldsymbol{\mathbf{x}}_{i}(t)}\cup\{p\})-f(\mathcal{R}_{\boldsymbol{\mathbf{x}}_{i}(t)}\setminus\{p\})\geq 0 and hence ~F(𝐱i(t))\widetilde{\nabla}F(\boldsymbol{\mathbf{x}}_{i}(t)) has positive entries i𝒜\forall i\in\mathcal{A}. Then, because 𝐯~j(t)Pj()\widetilde{\boldsymbol{\mathbf{v}}}_{j}(t)\in P_{j}(\mathcal{M}), it follows from (12) that 𝐯~j(t)\widetilde{\boldsymbol{\mathbf{v}}}_{j}(t) has non-negative entries, v~jp(t)0\widetilde{{v}}_{jp}(t)\geq 0, which satisfy p𝒫jv~jp(t)=1\sum_{p\in\mathcal{P}_{j}}\widetilde{{v}}_{jp}(t)=1. Therefore, it follows from (16) and Lemma 5.1 that

    𝟏.𝐱jj(t+1)=𝟏.𝐱jj(t)+1T,j𝒜.\displaystyle\boldsymbol{\mathbf{1}}.\boldsymbol{\mathbf{x}}_{jj}(t+1)=\boldsymbol{\mathbf{1}}.\boldsymbol{\mathbf{x}}_{jj}(t)+\frac{1}{T},\quad j\in\mathcal{A}. (25)

    Using (25), we can also write

    𝟏.𝐱jj(t)=𝟏.𝐱jj(td(𝒢))+1Td(𝒢),j𝒜.\displaystyle\boldsymbol{\mathbf{1}}.\boldsymbol{\mathbf{x}}_{jj}(t)=\boldsymbol{\mathbf{1}}.\boldsymbol{\mathbf{x}}_{jj}(t-d(\mathcal{G}))+\frac{1}{T}d(\mathcal{G}),\quad j\in\mathcal{A}. (26)

    Furthermore, it follows from Lemma (5.1) that for all p𝒫j\forall p\in\mathcal{P}_{j} and any i𝒜\{j}i\in\mathcal{A}\backslash\{j\} we can write

    xjjp(t)x^ijp(t).\displaystyle x_{jjp}(t)\geq\hat{x}_{ijp}(t). (27)

    Also, since every agent i𝒜\{j}i\in\mathcal{A}\backslash\{j\} can be reached from agent j𝒜j\in\mathcal{A} at most in d(𝒢)d(\mathcal{G}) hops, it follows from the propagation and update laws (16) and (17), for all p𝒫j\forall p\in\mathcal{P}_{j}, for any i𝒜\{j}i\in\mathcal{A}\backslash\{j\} that

    x^ijp(t)xjjp(td(𝒢)).\displaystyle\hat{x}_{ijp}(t)\geq x_{jjp}(t-d(\mathcal{G})). (28)

    Thus, for any j𝒜j\in\mathcal{A} and i𝒜\{j}i\in\mathcal{A}\backslash\{j\}, (27) and (28) result in

    𝟏.𝐱jj(t)𝟏.𝐱^ij(t)𝟏.𝐱jj(td(𝒢)).\displaystyle\boldsymbol{\mathbf{1}}.\boldsymbol{\mathbf{x}}_{jj}(t)\geq\boldsymbol{\mathbf{1}}.\hat{\boldsymbol{\mathbf{x}}}_{ij}(t)\geq\boldsymbol{\mathbf{1}}.\boldsymbol{\mathbf{x}}_{jj}(t-d(\mathcal{G})). (29)

    Next, we can use (26) and (29) to write

    𝟏.𝐱jj(t)𝟏.𝐱^ij(t)𝟏.𝐱jj(t)1Td(𝒢),\displaystyle\boldsymbol{\mathbf{1}}.\boldsymbol{\mathbf{x}}_{jj}(t)\geq\boldsymbol{\mathbf{1}}.\hat{\boldsymbol{\mathbf{x}}}_{ij}(t)\geq\boldsymbol{\mathbf{1}}.\boldsymbol{\mathbf{x}}_{jj}(t)-\frac{1}{T}d(\mathcal{G}), (30)

    for j𝒜j\in\mathcal{A} and i𝒜\{j}i\in\mathcal{A}\backslash\{j\}. Using (30) for any i𝒜i\in\mathcal{A} we can write

    j𝒜𝟏.𝐱jj(t)𝐱ii(t)+j𝒜\{i}𝟏.𝐱^ij(t)j𝒜𝟏.𝐱jj(t)1TNd(𝒢).\displaystyle\sum_{j\in\mathcal{A}}\boldsymbol{\mathbf{1}}.\boldsymbol{\mathbf{x}}_{jj}(t)\geq\boldsymbol{\mathbf{x}}_{ii}(t)+\sum_{j\in\mathcal{A}\backslash\{i\}}\boldsymbol{\mathbf{1}}.\hat{\boldsymbol{\mathbf{x}}}_{ij}(t)\geq\sum_{j\in\mathcal{A}}\boldsymbol{\mathbf{1}}.\boldsymbol{\mathbf{x}}_{jj}(t)-\frac{1}{T}Nd(\mathcal{G}). (31)

    Then, using Lemma 5.1, from (31) we can write

    \displaystyle\boldsymbol{\mathbf{1}}.\bar{\boldsymbol{\mathbf{x}}}(t)\geq\boldsymbol{\mathbf{1}}.\boldsymbol{\mathbf{x}}_{i}(t)\geq\boldsymbol{\mathbf{1}}.\bar{\boldsymbol{\mathbf{x}}}(t)-\frac{1}{T}Nd(\mathcal{G}),

    which ascertains (19a). Next, note that from Lemma 5.1 we have \boldsymbol{\mathbf{x}}_{jj}(t)=\boldsymbol{\mathbf{x}}_{jj}^{-}(t) for any j\in\mathcal{A}. Then, using (16) and invoking Lemma 5.1, we obtain (19b), which, given (25), also ascertains (19c). \Box
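For completeness, the two bookkeeping steps used in the proof above can be spelled out. Relation (26) follows by applying (25) repeatedly over the time steps t-d(\mathcal{G}),\dots,t-1, each of which adds exactly \frac{1}{T} to \boldsymbol{\mathbf{1}}.\boldsymbol{\mathbf{x}}_{jj}. Relation (31) follows from (30): its upper bound uses \boldsymbol{\mathbf{1}}.\hat{\boldsymbol{\mathbf{x}}}_{ij}(t)\leq\boldsymbol{\mathbf{1}}.\boldsymbol{\mathbf{x}}_{jj}(t), while its lower bound is

\displaystyle\boldsymbol{\mathbf{1}}.\boldsymbol{\mathbf{x}}_{ii}(t)+\sum_{j\in\mathcal{A}\backslash\{i\}}\boldsymbol{\mathbf{1}}.\hat{\boldsymbol{\mathbf{x}}}_{ij}(t)\geq\boldsymbol{\mathbf{1}}.\boldsymbol{\mathbf{x}}_{ii}(t)+\sum_{j\in\mathcal{A}\backslash\{i\}}\Big{(}\boldsymbol{\mathbf{1}}.\boldsymbol{\mathbf{x}}_{jj}(t)-\frac{1}{T}d(\mathcal{G})\Big{)}\geq\sum_{j\in\mathcal{A}}\boldsymbol{\mathbf{1}}.\boldsymbol{\mathbf{x}}_{jj}(t)-\frac{1}{T}Nd(\mathcal{G}),

since |\mathcal{A}\backslash\{i\}|=N-1\leq N.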

  • Proof of Theorem 5.1

    Knowing that \left|\frac{\partial^{2}F}{\partial x_{p}\partial x_{q}}\right|\leq f(\mathcal{S}^{\star}) from Lemma 7.2, and using (19c), it follows from Lemma 7.3 that

    \displaystyle F(\bar{\boldsymbol{\mathbf{x}}}(t+1))-F(\bar{\boldsymbol{\mathbf{x}}}(t))\geq\nabla F(\bar{\boldsymbol{\mathbf{x}}}(t)).(\bar{\boldsymbol{\mathbf{x}}}(t+1)-\bar{\boldsymbol{\mathbf{x}}}(t))-\frac{1}{2}N^{2}\frac{1}{T^{2}}f(\mathcal{S}^{\star}),

    which, given (19b), leads to

    \displaystyle F(\bar{\boldsymbol{\mathbf{x}}}(t+1))-F(\bar{\boldsymbol{\mathbf{x}}}(t))\geq\frac{1}{T}\sum_{i\in\mathcal{A}}\widetilde{\boldsymbol{\mathbf{v}}}_{i}(t).\nabla F(\bar{\boldsymbol{\mathbf{x}}}(t))-\frac{1}{2}N^{2}\frac{1}{T^{2}}f(\mathcal{S}^{\star}). (32)

    Next, we note that, by definition, \bar{\boldsymbol{\mathbf{x}}}(t)\geq\boldsymbol{\mathbf{x}}_{i}(t) for any i\in\mathcal{A}. Therefore, given (19a), by invoking Lemma 7.3, for any i\in\mathcal{A} we can write

    \displaystyle\left|\frac{\partial F}{\partial x_{p}}(\bar{\boldsymbol{\mathbf{x}}}(t))-\frac{\partial F}{\partial x_{p}}({\boldsymbol{\mathbf{x}}}_{i}(t))\right|\leq N\frac{1}{T}d(\mathcal{G})f(\mathcal{S}^{\star}), (33)

    for p\in\{1,\cdots,n\}. Recall that at each time step t, the realization of \widetilde{\boldsymbol{\mathbf{v}}}_{i}(t) in (12) that Algorithm 2 uses is

    \displaystyle\widetilde{\boldsymbol{\mathbf{v}}}_{i}(t)=\boldsymbol{\mathbf{1}}_{p^{\star}},\qquad p^{\star}\in\mathcal{P}_{i}~~\text{is given by (20)} (34)

    for every i\in\mathcal{A}. Thus, \boldsymbol{\mathbf{1}}.\widetilde{\boldsymbol{\mathbf{v}}}_{i}(t)=1, i\in\mathcal{A}. Consequently, since each \widetilde{\boldsymbol{\mathbf{v}}}_{i}(t) is a standard basis vector, applying the entry-wise bound (33) to its single nonzero entry and summing over the N agents, we can write

    \displaystyle\sum_{i\in\mathcal{A}}\widetilde{\boldsymbol{\mathbf{v}}}_{i}(t).\nabla F(\bar{\boldsymbol{\mathbf{x}}}(t))\geq\sum_{i\in\mathcal{A}}\widetilde{\boldsymbol{\mathbf{v}}}_{i}(t).\nabla F(\boldsymbol{\mathbf{x}}_{i}(t))-N^{2}\frac{1}{T}d(\mathcal{G})f(\mathcal{S}^{\star}). (35)

    Next, we let

    \bar{\boldsymbol{\mathbf{v}}}_{i}(t)=\underset{\boldsymbol{\mathbf{v}}\in P_{i}(\mathcal{M})}{\textup{argmax}}\,\boldsymbol{\mathbf{v}}.\nabla F(\bar{\boldsymbol{\mathbf{x}}}(t))

    and

    \hat{\boldsymbol{\mathbf{v}}}_{i}(t)=\underset{\boldsymbol{\mathbf{v}}\in P_{i}(\mathcal{M})}{\textup{argmax}}\,\boldsymbol{\mathbf{v}}.\nabla F(\boldsymbol{\mathbf{x}}_{i}(t)).

    Because f is monotone increasing, by virtue of Lemma 3.1 we have \frac{\partial F}{\partial x_{p}}\geq 0, and as such \bar{\boldsymbol{\mathbf{v}}}_{i}(t)=\boldsymbol{\mathbf{1}}_{\bar{p}} and \hat{\boldsymbol{\mathbf{v}}}_{i}(t)=\boldsymbol{\mathbf{1}}_{\hat{p}}, where \bar{p}=\underset{p\in\mathcal{P}_{i}}{\arg\max}\frac{\partial F(\bar{\boldsymbol{\mathbf{x}}}(t))}{\partial x_{p}} and \hat{p}=\underset{p\in\mathcal{P}_{i}}{\arg\max}\frac{\partial F(\boldsymbol{\mathbf{x}}_{i}(t))}{\partial x_{p}}. Therefore, using

    \hat{\boldsymbol{\mathbf{v}}}_{i}(t).\nabla F(\boldsymbol{\mathbf{x}}_{i}(t))\geq\bar{\boldsymbol{\mathbf{v}}}_{i}(t).\nabla F(\boldsymbol{\mathbf{x}}_{i}(t))

    and

    \hat{\boldsymbol{\mathbf{v}}}_{i}(t).\nabla F(\boldsymbol{\mathbf{x}}_{i}(t))\geq\tilde{\boldsymbol{\mathbf{v}}}_{i}(t).\nabla F(\boldsymbol{\mathbf{x}}_{i}(t)),

    i\in\mathcal{A}, and (33), we can also write

    \displaystyle\sum_{i\in\mathcal{A}}\hat{\boldsymbol{\mathbf{v}}}_{i}(t).\nabla F(\boldsymbol{\mathbf{x}}_{i}(t))\geq\sum_{i\in\mathcal{A}}\bar{\boldsymbol{\mathbf{v}}}_{i}(t).\nabla F(\boldsymbol{\mathbf{x}}_{i}(t))\geq\sum_{i\in\mathcal{A}}\bar{\boldsymbol{\mathbf{v}}}_{i}(t).\nabla F(\bar{\boldsymbol{\mathbf{x}}}(t))-N^{2}\frac{1}{T}d(\mathcal{G})f(\mathcal{S}^{\star}), (36a)
    \displaystyle\sum_{i\in\mathcal{A}}\hat{\boldsymbol{\mathbf{v}}}_{i}(t).\nabla F(\boldsymbol{\mathbf{x}}_{i}(t))\geq\sum_{i\in\mathcal{A}}\tilde{\boldsymbol{\mathbf{v}}}_{i}(t).\nabla F(\boldsymbol{\mathbf{x}}_{i}(t)). (36b)

On the other hand, by virtue of Lemma 7.4, the estimated partial derivatives \frac{\widetilde{\partial F}}{\partial x_{p}}(\boldsymbol{\mathbf{x}}_{j}(t)), p\in\mathcal{P}_{j}, that each agent j\in\mathcal{A} uses to solve optimization problem (20) (equivalently (12)) satisfy

\displaystyle\left|\frac{\widetilde{\partial F}}{\partial x_{p}}(\boldsymbol{\mathbf{x}}_{j}(t))-\frac{{\partial F}}{\partial x_{p}}(\boldsymbol{\mathbf{x}}_{j}(t))\right|\leq\frac{1}{2T}f(\mathcal{S}^{\star}) (37)

with probability at least 1-2\textup{e}^{-\frac{1}{8T^{2}}K_{j}}. Using (36b) and (37), together with the fact that the samples are drawn independently, we can write

\displaystyle\sum_{i\in\mathcal{A}}\widetilde{\boldsymbol{\mathbf{v}}}_{i}(t).\nabla F(\boldsymbol{\mathbf{x}}_{i}(t))\geq\sum_{i\in\mathcal{A}}\widetilde{\boldsymbol{\mathbf{v}}}_{i}(t).\widetilde{\nabla F}(\boldsymbol{\mathbf{x}}_{i}(t))-N\frac{1}{2T}f(\mathcal{S}^{\star}), (38a)
\displaystyle\sum_{i\in\mathcal{A}}\widetilde{\boldsymbol{\mathbf{v}}}_{i}(t).\widetilde{\nabla F}(\boldsymbol{\mathbf{x}}_{i}(t))\geq\sum_{i\in\mathcal{A}}\hat{\boldsymbol{\mathbf{v}}}_{i}(t).\widetilde{\nabla F}(\boldsymbol{\mathbf{x}}_{i}(t))\geq\sum_{i\in\mathcal{A}}\hat{\boldsymbol{\mathbf{v}}}_{i}(t).\nabla F(\boldsymbol{\mathbf{x}}_{i}(t))-N\frac{1}{2T}f(\mathcal{S}^{\star}), (38b)

with probability at least \prod_{i\in\mathcal{A}}(1-2\textup{e}^{-\frac{1}{8T^{2}}K_{i}})^{|\mathcal{P}_{i}|}.

From (35), (36a), (38a), and (38b), we can now write

\displaystyle\sum_{i\in\mathcal{A}}\widetilde{\boldsymbol{\mathbf{v}}}_{i}(t).\nabla F(\bar{\boldsymbol{\mathbf{x}}}(t))\geq\sum_{i\in\mathcal{A}}\bar{\boldsymbol{\mathbf{v}}}_{i}(t).\nabla F(\bar{\boldsymbol{\mathbf{x}}}(t))-(2Nd(\mathcal{G})+1)N\frac{1}{T}f(\mathcal{S}^{\star}), (39)

with probability at least 1-2\,\sum_{i\in\mathcal{A}}\textup{e}^{-\frac{1}{8T^{2}}K_{i}}.
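For clarity, the constant (2Nd(\mathcal{G})+1)\frac{N}{T} in (39) is obtained by chaining the bounds above and adding up their error terms:

\displaystyle\sum_{i\in\mathcal{A}}\widetilde{\boldsymbol{\mathbf{v}}}_{i}(t).\nabla F(\bar{\boldsymbol{\mathbf{x}}}(t))\overset{(35)}{\geq}\sum_{i\in\mathcal{A}}\widetilde{\boldsymbol{\mathbf{v}}}_{i}(t).\nabla F(\boldsymbol{\mathbf{x}}_{i}(t))-N^{2}\frac{d(\mathcal{G})}{T}f(\mathcal{S}^{\star})\overset{(38a),(38b)}{\geq}\sum_{i\in\mathcal{A}}\hat{\boldsymbol{\mathbf{v}}}_{i}(t).\nabla F(\boldsymbol{\mathbf{x}}_{i}(t))-\Big{(}N^{2}\frac{d(\mathcal{G})}{T}+\frac{N}{T}\Big{)}f(\mathcal{S}^{\star})\overset{(36a)}{\geq}\sum_{i\in\mathcal{A}}\bar{\boldsymbol{\mathbf{v}}}_{i}(t).\nabla F(\bar{\boldsymbol{\mathbf{x}}}(t))-\Big{(}2N^{2}\frac{d(\mathcal{G})}{T}+\frac{N}{T}\Big{)}f(\mathcal{S}^{\star}),

since 2N^{2}\frac{d(\mathcal{G})}{T}+\frac{N}{T}=(2Nd(\mathcal{G})+1)\frac{N}{T}.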

Next, let \boldsymbol{\mathbf{v}}_{i}^{\star} be the projection of \boldsymbol{\mathbf{1}}_{\mathcal{S}^{\star}} onto P_{i}(\mathcal{M}). Since the P_{i}(\mathcal{M}) are disjoint subspaces of P(\mathcal{M}) that together cover the whole space, we can write

\displaystyle\boldsymbol{\mathbf{1}}_{\mathcal{S}^{\star}}=\sum_{i\in\mathcal{A}}\boldsymbol{\mathbf{v}}_{i}^{\star}. (40)

Then, using (39) and (40), invoking Lemma 7.1, and using the fact that \bar{\boldsymbol{\mathbf{v}}}_{i}(t).\nabla F(\bar{\boldsymbol{\mathbf{x}}}(t))\geq\boldsymbol{\mathbf{v}}^{\star}_{i}.\nabla F(\bar{\boldsymbol{\mathbf{x}}}(t)), we obtain

\displaystyle\sum_{i\in\mathcal{A}}\widetilde{\boldsymbol{\mathbf{v}}}_{i}(t).\nabla F(\bar{\boldsymbol{\mathbf{x}}}(t))\geq\sum_{i\in\mathcal{A}}\boldsymbol{\mathbf{v}}^{\star}_{i}.\nabla F(\bar{\boldsymbol{\mathbf{x}}}(t))-(2Nd(\mathcal{G})+1)N\frac{1}{T}f(\mathcal{S}^{\star})=
\boldsymbol{\mathbf{1}}_{\mathcal{S}^{\star}}.\nabla F(\bar{\boldsymbol{\mathbf{x}}}(t))-(2Nd(\mathcal{G})+1)N\frac{1}{T}f(\mathcal{S}^{\star})\geq f(\mathcal{S}^{\star})-F(\bar{\boldsymbol{\mathbf{x}}}(t))-(2Nd(\mathcal{G})+1)\frac{N}{T}f(\mathcal{S}^{\star}), (41)

with probability at least \prod_{i\in\mathcal{A}}(1-2\textup{e}^{-\frac{1}{8T^{2}}K_{i}})^{|\mathcal{P}_{i}|}. Hence, using (32) and (41), we conclude that

\displaystyle F(\bar{\boldsymbol{\mathbf{x}}}(t+1))-F(\bar{\boldsymbol{\mathbf{x}}}(t))\geq\frac{1}{T}(f(\mathcal{S}^{\star})-F(\bar{\boldsymbol{\mathbf{x}}}(t)))-(2Nd(\mathcal{G})+\frac{1}{2}N+1)\frac{N}{T^{2}}f(\mathcal{S}^{\star}), (42)

with probability at least \prod_{i\in\mathcal{A}}(1-2\textup{e}^{-\frac{1}{8T^{2}}K_{i}})^{|\mathcal{P}_{i}|}.
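For clarity, the constant in (42) simply collects the two error terms inherited from (32) and (41):

\displaystyle\frac{1}{2}N^{2}\frac{1}{T^{2}}f(\mathcal{S}^{\star})+\frac{1}{T}(2Nd(\mathcal{G})+1)\frac{N}{T}f(\mathcal{S}^{\star})=\Big{(}2Nd(\mathcal{G})+\frac{1}{2}N+1\Big{)}\frac{N}{T^{2}}f(\mathcal{S}^{\star}).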

Next, let g(t)=f(\mathcal{S}^{\star})-F(\bar{\boldsymbol{\mathbf{x}}}(t)) and \beta=(2Nd(\mathcal{G})+\frac{1}{2}N+1)\frac{N}{T^{2}}f(\mathcal{S}^{\star}), and rewrite (42) as

\displaystyle(f(\mathcal{S}^{\star})-F(\bar{\boldsymbol{\mathbf{x}}}(t)))-(f(\mathcal{S}^{\star})-F(\bar{\boldsymbol{\mathbf{x}}}(t+1)))=
g(t)-g(t+1)\geq\frac{1}{T}(f(\mathcal{S}^{\star})-F(\bar{\boldsymbol{\mathbf{x}}}(t)))-\beta=\frac{1}{T}g(t)-\beta. (43)

Then, from inequality (43), we get

\displaystyle g(t+1)\leq(1-\frac{1}{T})g(t)+\beta (44)

with probability at least \prod_{i\in\mathcal{A}}(1-2\textup{e}^{-\frac{1}{8T^{2}}K_{i}})^{|\mathcal{P}_{i}|}. Applying inequality (44) recursively from t=0 to t=T yields

\displaystyle g(T)\leq(1-\frac{1}{T})^{T}g(0)+\beta\sum_{k=0}^{T-1}(1-\frac{1}{T})^{k}=(1-\frac{1}{T})^{T}g(0)+T\beta(1-(1-\frac{1}{T})^{T}) (45)

with probability at least \left(\prod_{i\in\mathcal{A}}(1-2\textup{e}^{-\frac{1}{8T^{2}}K_{i}})^{|\mathcal{P}_{i}|}\right)^{T}. Substituting g(T)=f(\mathcal{S}^{\star})-F(\bar{\boldsymbol{\mathbf{x}}}(T)) and g(0)=f(\mathcal{S}^{\star})-F(\boldsymbol{\mathbf{x}}(0))=f(\mathcal{S}^{\star}) back into (45), we then obtain

\displaystyle(1-(1-\frac{1}{T})^{T})(f(\mathcal{S}^{\star})-T\beta)=(1-(1-\frac{1}{T})^{T})(1-(2Nd(\mathcal{G})+\frac{1}{2}N+1)\frac{N}{T})f(\mathcal{S}^{\star})
\leq F(\bar{\boldsymbol{\mathbf{x}}}(T)), (46)

with probability at least \left(\prod_{i\in\mathcal{A}}(1-2\textup{e}^{-\frac{1}{8T^{2}}K_{i}})^{|\mathcal{P}_{i}|}\right)^{T}. By applying (1-\frac{1}{T})^{T}\leq\frac{1}{\textup{e}}, and hence 1-(1-\frac{1}{T})^{T}\geq 1-\frac{1}{\textup{e}}, we get

\displaystyle(1-\frac{1}{\textup{e}})(1-(2Nd(\mathcal{G})+\frac{1}{2}N+1)\frac{N}{T})f(\mathcal{S}^{\star})\leq F(\bar{\boldsymbol{\mathbf{x}}}(T)), (47)

with probability at least \left(\prod_{i\in\mathcal{A}}(1-2\textup{e}^{-\frac{1}{8T^{2}}K_{i}})^{|\mathcal{P}_{i}|}\right)^{T}.
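The last two steps rely on the elementary identities

\displaystyle\sum_{k=0}^{T-1}(1-\frac{1}{T})^{k}=\frac{1-(1-\frac{1}{T})^{T}}{1-(1-\frac{1}{T})}=T\,(1-(1-\frac{1}{T})^{T}),\qquad (1-\frac{1}{T})^{T}\leq\textup{e}^{-1}\ \text{for all}\ T\geq 1,

which were used in (45) and in passing from (46) to (47), respectively.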

Given (34), from the propagation and update rules (16) and (17) and Lemma 5.1, we can conclude that \boldsymbol{\mathbf{1}}.\boldsymbol{\mathbf{x}}_{ii}(T)=1. Furthermore, define \mathcal{R}_{\boldsymbol{\mathbf{x}}_{ii}(T)} to be the random set obtained by sampling each member of \mathcal{P}_{i} independently according to \boldsymbol{\mathbf{x}}_{ii}(T). Since \boldsymbol{\mathbf{1}}.\boldsymbol{\mathbf{x}}_{ii}(T)=1, we can also define \mathcal{T}_{\boldsymbol{\mathbf{x}}_{ii}(T)} to be the random set in which exactly one policy is sampled from \mathcal{P}_{i} according to \boldsymbol{\mathbf{x}}_{ii}(T). Then, using Lemma 7.5, we can write

\displaystyle F(\bar{\boldsymbol{\mathbf{x}}}(T))=\mathbb{E}[f(\mathcal{R}_{\bar{\boldsymbol{\mathbf{x}}}(T)})]=\mathbb{E}[f(\mathcal{R}_{\boldsymbol{\mathbf{x}}_{11}(T)}\cup\cdots\cup\mathcal{R}_{\boldsymbol{\mathbf{x}}_{NN}(T)})]
\leq\mathbb{E}[f(\mathcal{T}_{\boldsymbol{\mathbf{x}}_{11}(T)}\cup\mathcal{R}_{\boldsymbol{\mathbf{x}}_{22}(T)}\cup\cdots\cup\mathcal{R}_{\boldsymbol{\mathbf{x}}_{NN}(T)})]
\leq\cdots
\leq\mathbb{E}[f(\mathcal{T}_{\boldsymbol{\mathbf{x}}_{11}(T)}\cup\cdots\cup\mathcal{T}_{\boldsymbol{\mathbf{x}}_{NN}(T)})],

which, combined with (47), concludes the proof. \Box
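As a rough numerical illustration of the optimality gap in (47) (the numbers below are only an example and are not taken from the paper's simulations): with N=5 agents, graph diameter d(\mathcal{G})=2, and T=1000 iterations, the correction factor is 1-(2\cdot 5\cdot 2+\frac{5}{2}+1)\frac{5}{1000}=0.8825, so the guarantee reads F(\bar{\boldsymbol{\mathbf{x}}}(T))\geq(1-\frac{1}{\textup{e}})\cdot 0.8825\,f(\mathcal{S}^{\star})\approx 0.56\,f(\mathcal{S}^{\star}).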

Appendix B: Auxiliary Results

In what follows, we let \mathcal{S}^{\star} represent the optimizer of problem (8).

Lemma 7.1

Consider the set-valued optimization problem (8). Suppose f:\mathcal{P}\to{\mathbb{R}}_{\geq 0} is an increasing and submodular set function, and consider its multilinear extension F:{\mathbb{R}}_{\geq 0}^{n}\to{\mathbb{R}}_{\geq 0}. Then, for all \boldsymbol{\mathbf{x}}\in P(\mathcal{M}),

\displaystyle\boldsymbol{\mathbf{1}}_{\mathcal{S}^{\star}}.\nabla F(\boldsymbol{\mathbf{x}})\geq f(\mathcal{S}^{\star})-F(\boldsymbol{\mathbf{x}}).
  • Proof

    See [21] for the proof. \Box

Lemma 7.2

Consider the set-valued optimization problem (8). Suppose f:\mathcal{P}\to{\mathbb{R}}_{\geq 0} is an increasing and submodular set function, and consider its multilinear extension F:{\mathbb{R}}_{\geq 0}^{n}\to{\mathbb{R}}_{\geq 0}. Then

\displaystyle\left|\frac{\partial^{2}F}{\partial x_{p}\partial x_{q}}\right|\leq f(\mathcal{S}^{\star}),\qquad p,q\in\mathcal{P}.
  • Proof

    Since p\not\in\mathcal{R}_{\boldsymbol{\mathbf{x}}}\cup\{q\}\setminus\{p\}, it follows from the submodularity of f that

    \displaystyle 0\leq\Delta_{f}(p|\mathcal{R}_{\boldsymbol{\mathbf{x}}}\cup\{q\}\setminus\{p\})\leq f(\{p\}),
    \displaystyle 0\leq\Delta_{f}(p|\mathcal{R}_{\boldsymbol{\mathbf{x}}}\setminus\{p,q\})\leq f(\{p\}). (48)

    Knowing that

    \displaystyle\Delta_{f}(p|\mathcal{R}_{\boldsymbol{\mathbf{x}}}\cup\{q\}\setminus\{p\})=f(\mathcal{R}_{\boldsymbol{\mathbf{x}}}\cup\{p,q\})-f(\mathcal{R}_{\boldsymbol{\mathbf{x}}}\cup\{q\}\setminus\{p\}),

    and

    \displaystyle\Delta_{f}(p|\mathcal{R}_{\boldsymbol{\mathbf{x}}}\setminus\{p,q\})=f(\mathcal{R}_{\boldsymbol{\mathbf{x}}}\cup\{p\}\setminus\{q\})-f(\mathcal{R}_{\boldsymbol{\mathbf{x}}}\setminus\{p,q\}),

    then, using definition (3), the mixed second-order partial derivative can be written as

    \displaystyle\frac{\partial^{2}F}{\partial x_{p}\partial x_{q}}=\mathbb{E}[\Delta_{f}(p|\mathcal{R}_{\boldsymbol{\mathbf{x}}}\cup\{q\}\setminus\{p\})-\Delta_{f}(p|\mathcal{R}_{\boldsymbol{\mathbf{x}}}\setminus\{p,q\})]. (49)

    Putting (48) and (49) together results in

    \displaystyle\left|\frac{\partial^{2}F}{\partial x_{p}\partial x_{q}}\right|\leq f(\{p\}),

    which, together with f(\{p\})\leq f(\mathcal{S}^{\star}), concludes the proof. \Box

Lemma 7.3

Consider a twice differentiable function F(\boldsymbol{\mathbf{x}}):\mathbb{R}^{N}\to\mathbb{R} which satisfies \left|\frac{\partial^{2}F}{\partial x_{p}\partial x_{q}}\right|\leq\alpha for any p,q\in\{1,\cdots,N\}. Then, for any \boldsymbol{\mathbf{x}}_{1},\boldsymbol{\mathbf{x}}_{2}\in\mathbb{R}^{N} satisfying \boldsymbol{\mathbf{x}}_{2}\geq\boldsymbol{\mathbf{x}}_{1} and \boldsymbol{\mathbf{1}}.({\boldsymbol{\mathbf{x}}}_{2}-{\boldsymbol{\mathbf{x}}}_{1})\leq\beta, the following holds:

\displaystyle\left|\frac{\partial F}{\partial x_{p}}({\boldsymbol{\mathbf{x}}}_{1}+\epsilon({\boldsymbol{\mathbf{x}}}_{2}-{\boldsymbol{\mathbf{x}}}_{1}))-\frac{\partial F}{\partial x_{p}}({\boldsymbol{\mathbf{x}}}_{1})\right|\leq\epsilon\alpha\beta, (50a)
\displaystyle F({\boldsymbol{\mathbf{x}}}_{2})-F({\boldsymbol{\mathbf{x}}}_{1})\geq\nabla F({\boldsymbol{\mathbf{x}}}_{1}).({\boldsymbol{\mathbf{x}}}_{2}-{\boldsymbol{\mathbf{x}}}_{1})-\frac{1}{2}\alpha\beta^{2}, (50b)

for \epsilon\in[0,1].

  • Proof

    Let \boldsymbol{\mathbf{h}}_{p}=[\frac{\partial^{2}F}{\partial x_{p}\partial x_{1}},\cdots,\frac{\partial^{2}F}{\partial x_{p}\partial x_{N}}]^{\top}. Then, we can write

    \displaystyle\left|\frac{\partial F}{\partial x_{p}}({\boldsymbol{\mathbf{x}}}_{1}+\epsilon({\boldsymbol{\mathbf{x}}}_{2}-{\boldsymbol{\mathbf{x}}}_{1}))-\frac{\partial F}{\partial x_{p}}({\boldsymbol{\mathbf{x}}}_{1})\right|=\left|\int_{0}^{\epsilon}\boldsymbol{\mathbf{h}}_{p}({\boldsymbol{\mathbf{x}}}_{1}+\tau({\boldsymbol{\mathbf{x}}}_{2}-{\boldsymbol{\mathbf{x}}}_{1})).({\boldsymbol{\mathbf{x}}}_{2}-{\boldsymbol{\mathbf{x}}}_{1})\textup{d}\tau\right|
    \leq\int_{0}^{\epsilon}\alpha\boldsymbol{\mathbf{1}}.({\boldsymbol{\mathbf{x}}}_{2}-{\boldsymbol{\mathbf{x}}}_{1})\textup{d}\tau\leq\epsilon\alpha\beta, (51)

    Furthermore,

    \displaystyle F({\boldsymbol{\mathbf{x}}}_{2})-F({\boldsymbol{\mathbf{x}}}_{1})=\int_{0}^{1}\nabla F({\boldsymbol{\mathbf{x}}}_{1}+\epsilon({\boldsymbol{\mathbf{x}}}_{2}-{\boldsymbol{\mathbf{x}}}_{1})).({\boldsymbol{\mathbf{x}}}_{2}-{\boldsymbol{\mathbf{x}}}_{1})\textup{d}\epsilon
    \geq\int_{0}^{1}(\nabla F({\boldsymbol{\mathbf{x}}}_{1})-\epsilon\alpha\beta\boldsymbol{\mathbf{1}}).({\boldsymbol{\mathbf{x}}}_{2}-{\boldsymbol{\mathbf{x}}}_{1})\textup{d}\epsilon\geq\nabla F({\boldsymbol{\mathbf{x}}}_{1}).({\boldsymbol{\mathbf{x}}}_{2}-{\boldsymbol{\mathbf{x}}}_{1})-\frac{1}{2}\alpha\beta^{2},

    where the inequality in the second line follows from (51), which concludes the proof. \Box

Lemma 7.4

Consider the set-valued optimization problem (8). Suppose f:\mathcal{P}\to{\mathbb{R}}_{\geq 0} is an increasing and submodular set function, and consider its multilinear extension F:{\mathbb{R}}_{\geq 0}^{n}\to{\mathbb{R}}_{\geq 0}. Let \widetilde{\nabla F}(\boldsymbol{\mathbf{x}}) be the estimate of \nabla F(\boldsymbol{\mathbf{x}}) that is calculated by taking K\in\mathbb{Z}_{>0} samples of the set \mathcal{R}_{\boldsymbol{\mathbf{x}}} according to the membership probability vector \boldsymbol{\mathbf{x}}. Then

\displaystyle\left|\frac{\widetilde{\partial F}}{\partial x_{p}}(\boldsymbol{\mathbf{x}})-\frac{{\partial F}}{\partial x_{p}}(\boldsymbol{\mathbf{x}})\right|\geq\frac{1}{2T}f(\mathcal{S}^{\star}),\quad p\in\{1,\cdots,n\}, (52)

with probability at most 2\textup{e}^{-\frac{1}{8T^{2}}K}, for any T\in\mathbb{Z}_{>0}.

  • Proof

    Define the random variable

    \displaystyle X=\Big{(}(f(\mathcal{R}_{\boldsymbol{\mathbf{x}}}\cup\{p\})-f(\mathcal{R}_{\boldsymbol{\mathbf{x}}}\setminus\{p\}))-\frac{\partial F}{\partial x_{p}}(\boldsymbol{\mathbf{x}})\Big{)}\Big{/}f(\mathcal{S}^{\star}),

    and assume that K samples of \mathcal{R}_{\boldsymbol{\mathbf{x}}} are taken to construct the realizations \{X_{k}\}_{k=1}^{K} of X (an illustrative code sketch of this estimator is given after the proof). Since f is a submodular function, we have f(\mathcal{R}_{\boldsymbol{\mathbf{x}}}\cup\{p\})-f(\mathcal{R}_{\boldsymbol{\mathbf{x}}}\setminus\{p\})\leq f(\mathcal{S}^{\star}). Consequently, using equation (5), we conclude that \mathbb{E}[X]=0 and |X|\leq 1. Hence, using Theorem 3.1, we have

    \displaystyle\left|\sum_{k=1}^{K}X_{k}\right|\geq\frac{1}{2T}K,

    with probability at most 2\textup{e}^{-\frac{1}{8T^{2}}K}. Hence, the estimation error of \nabla F(\boldsymbol{\mathbf{x}}) satisfies

    \displaystyle\left|\frac{\widetilde{\partial F}}{\partial x_{p}}(\boldsymbol{\mathbf{x}})-\frac{{\partial F}}{\partial x_{p}}(\boldsymbol{\mathbf{x}})\right|\geq\frac{1}{2T}f(\mathcal{S}^{\star}),\quad p\in\{1,\cdots,n\},

    with probability at most 2\textup{e}^{-\frac{1}{8T^{2}}K}. \Box
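To make the sampling step behind Lemma 7.4 concrete, the following is a minimal Python sketch of the Monte-Carlo estimator of \frac{\partial F}{\partial x_{p}} (the sample average of the marginal differences). The function names, the toy coverage utility, and the probability values are illustrative assumptions and are not taken from the paper or its implementation.

    import random

    def estimate_partial(f, x, p, K, rng=random):
        # Monte-Carlo estimate of dF/dx_p for the multilinear extension of f.
        # f: monotone submodular set function (takes a Python set, returns a float)
        # x: dict mapping each ground-set element to its membership probability
        # p: element whose partial derivative is estimated
        # K: number of independent samples of the random set R_x
        total = 0.0
        for _ in range(K):
            # Sample R_x by including each element q independently with probability x[q].
            R = {q for q in x if rng.random() < x[q]}
            # Accumulate the marginal difference f(R U {p}) - f(R \ {p}).
            total += f(R | {p}) - f(R - {p})
        return total / K

    # Toy monotone submodular coverage utility (illustrative only).
    cover = {"a": {1, 2}, "b": {2, 3}, "c": {4}}
    f = lambda S: len(set().union(*(cover[s] for s in S))) if S else 0
    print(estimate_partial(f, {"a": 0.5, "b": 0.5, "c": 0.5}, "a", K=2000))  # approx. 1.5

Each estimated entry costs 2K evaluations of f, and Lemma 7.4 quantifies how large K must be for the resulting error to stay below \frac{1}{2T}f(\mathcal{S}^{\star}) with high probability.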

Lemma 7.5

Suppose f:\mathcal{P}\to{\mathbb{R}}_{\geq 0} is an increasing and submodular set function, and let \boldsymbol{\mathbf{x}} be a membership probability vector of a set \mathcal{Q}\subset\mathcal{P} with \boldsymbol{\mathbf{1}}.\boldsymbol{\mathbf{x}}=1. We define \mathcal{R}_{\boldsymbol{\mathbf{x}}} to be the set obtained by sampling each member of \mathcal{Q} independently according to the probability vector \boldsymbol{\mathbf{x}}, and \mathcal{T}_{\boldsymbol{\mathbf{x}}}=\{t\},\,t\in\mathcal{Q}, to be a single-member set whose element is chosen according to \boldsymbol{\mathbf{x}}. Then the following holds for any random set \mathcal{S}\subset\mathcal{P}\setminus\mathcal{Q}.

\displaystyle\mathbb{E}[f(\mathcal{R}_{\boldsymbol{\mathbf{x}}}\cup\mathcal{S})]\leq\mathbb{E}[f(\mathcal{T}_{\boldsymbol{\mathbf{x}}}\cup\mathcal{S})].
  • Proof

    Writing \mathcal{R}_{\boldsymbol{\mathbf{x}}}=\{r_{1},\cdots,r_{o}\}, we can write (the first inequality below uses the submodularity of f; see the remark following this proof)

    \displaystyle\mathbb{E}[f(\mathcal{R}_{\boldsymbol{\mathbf{x}}}\cup\mathcal{S})]=\mathbb{E}[f(\mathcal{S})+\sum_{l=1}^{o}\Delta_{f}(r_{l}|\mathcal{S}\cup\{r_{1},\cdots,r_{l-1}\})]
    \leq\mathbb{E}[f(\mathcal{S})+\sum_{r_{l}\in\mathcal{R}_{\boldsymbol{\mathbf{x}}}}\Delta_{f}(r_{l}|\mathcal{S})]=\mathbb{E}_{\mathcal{S}}[\mathbb{E}_{\mathcal{R}_{\boldsymbol{\mathbf{x}}}|\mathcal{S}}[f(\mathcal{S})+\sum_{r_{l}\in\mathcal{R}_{\boldsymbol{\mathbf{x}}}}\Delta_{f}(r_{l}|\mathcal{S})]]
    =\mathbb{E}_{\mathcal{S}}[f(\mathcal{S})+\mathbb{E}_{\mathcal{R}_{\boldsymbol{\mathbf{x}}}|\mathcal{S}}[\sum_{r_{l}\in\mathcal{R}_{\boldsymbol{\mathbf{x}}}}\Delta_{f}(r_{l}|\mathcal{S})]]=\mathbb{E}_{\mathcal{S}}[f(\mathcal{S})+\sum_{r_{l}\in\mathcal{Q}}x_{l}\Delta_{f}(r_{l}|\mathcal{S})]
    =\mathbb{E}_{\mathcal{S}}[f(\mathcal{S})+\mathbb{E}_{\mathcal{T}_{\boldsymbol{\mathbf{x}}}|\mathcal{S}}[\Delta_{f}(t|\mathcal{S})]]=\mathbb{E}_{\mathcal{S}}[\mathbb{E}_{\mathcal{T}_{\boldsymbol{\mathbf{x}}}|\mathcal{S}}[f(\mathcal{S})+\Delta_{f}(t|\mathcal{S})]]=\mathbb{E}[f(\mathcal{T}_{\boldsymbol{\mathbf{x}}}\cup\mathcal{S})]

    which concludes the proof. \Box
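The inequality used in the first step of the proof above is the diminishing-returns property of the submodular function f: since \mathcal{S}\subseteq\mathcal{S}\cup\{r_{1},\cdots,r_{l-1}\},

\displaystyle\Delta_{f}(r_{l}|\mathcal{S}\cup\{r_{1},\cdots,r_{l-1}\})\leq\Delta_{f}(r_{l}|\mathcal{S}),\qquad l\in\{1,\cdots,o\},

i.e., the marginal gain of adding r_{l} can only shrink as the base set grows.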