Combining Geometric and Information-Theoretic Approaches
for Multi-Robot Exploration

Aravind Preshant Premkumar, Kevin Yu, and Pratap Tokekar The authors are with the Department of Electrical & Computer Engineering, Virginia Tech, Blacksburg VA, USA. {aravindp, klyu, tokekar}@vt.eduThis material is based upon work supported by the National Science Foundation under Grant No. 1566247.This manuscript extends the work presented at [42]. The information-theoretic approach presented in Section IV-B did not appear in [42].

Abstract

We present an algorithm to explore an orthogonal polygon using a team of $p$ robots. This algorithm combines ideas from information-theoretic exploration algorithms and computational geometry based exploration algorithms. We show that the exploration time of our algorithm is competitive (as a function of $p$ ) with respect to the offline optimal exploration algorithm. The algorithm is based on a single-robot polygon exploration algorithm, a tree exploration algorithm for higher level planning and a submodular orienteering algorithm for lower level planning. We discuss how this strategy can be adapted to real-world settings to deal with noisy sensors. In addition to theoretical analysis, we investigate the performance of our algorithm through simulations for multiple robots and experiments with a single robot.

I Introduction

Exploration of unknown environments using a single robot has been a well-studied problem [44, 29]. The task can be performed faster if multiple robots are used. The challenge is to come up with an algorithm to efficiently coordinate multiple robots to explore the environment in the minimum amount of time. There have been two types of approaches towards solving the exploration problem: geometric and information-theoretic. In geometric approaches (e.g., [6]), it is typical to assume that the robots have perfect sensing. Geometric algorithms typically give global guarantees on distance traveled at the expense of restrictive assumptions about the environment and sensor models. On the other hand, information-theoretic approaches (e.g., [10]) explicitly take into account practical constraints such as noisy sensors and complex environments. However, these approaches are often greedy (e.g., frontier-based [50]) and typically do not yield any guarantees on the total time taken. In this paper, we investigate the challenges in combining information-theoretic algorithms with geometric exploration algorithms while preserving guarantees on exploration time.

We use competitive analysis [37] in order to analyze the cost of exploration. Competitive ratio of an online algorithm is defined as the worst-case ratio (over all inputs) of the cost of the online algorithm and the optimal offline algorithm cost. The optimal offline algorithm corresponds to the case when the input (i.e., the map of the environment) itself is known. The goal is to find online algorithms with small, constant competitive ratios, i.e., algorithms whose online performance is comparable to algorithms who know the input a priori.

We focus on the case of exploring unknown orthogonal polygons¹¹1An orthogonal polygon is one in which all sides are aligned with either the $X$ or the $Y$ axes. without any holes. Deng et al. [17] showed that there is an algorithm with a constant competitive ratio for exploring orthogonal 2-dimensional polygons with a single robot. We present an algorithm with constant competitive ratio for exploring with $p$ robots, when $p$ is fixed (Section III).

The analysis of this algorithm requires certain assumptions that do not necessarily hold in practice. Our second contribution is to show how to adapt this purely geometric algorithm for real-world constraints to incorporate sensing limitations and uncertainty (Section IV). We then extend this algorithm in order to improve the quality of the resulting map. We add a local planner to our algorithm that optimizes the information gain of the path taken by the robots while traversing in order to reduce the overall entropy in the map. We evaluate our algorithm through simulations and experiments on a mobile robot (Sections V-A and V-B). We begin with a discussion of the related work.

II Related Work

In this section, we present the existing work related to the exploration problem. We organize the related work into three broad categories: polygon exploration, graph exploration, and information-theoretic exploration.

II-A Polygon Exploration

The study of geometric problems that are based on visibility is a well-established field within computational geometry. The classic problems are the art gallery problem [39], watchman route problem [12], target search [22] and shortest path planning [41] in unknown environments.

Using a fixed set of positions for guarding a known $n$ -sided polygonal region, i.e., a set of points from which the entire polygon is visible, is known as the classical art gallery problem. Chvatal [14] and Fisk [21] proved that $\lfloor n/3\rfloor$ guards are always sufficient and sometimes necessary to cover a polygon of $n$ sides. The minimum number of guards required for a specific polygon may be much smaller than this upper bound. However, Schuchardt and Hecker [45] showed that finding a minimum set of guards is NP-hard, even for the special case of an orthogonal polygon.

Finding the shortest tour along which one mobile guard can see the polygon completely is the watchman route problem. Chin and Ntafos [12] showed that the watchman route can be found in polynomial time in a simple orthogonal polygon. Wang et al. [48] showed that the watchman route problem is for general environments is NP-hard and presented a $\mathcal{O}(\text{polylog}n)$ approximation for the restricted case when each viewpoint is required to see a complete polygon side.

Exploring an unknown polygon is the online watchman route problem. Bhattacharya et al. [6] and Ghosh et al. [24] approached the exploration problem with discrete vision, i.e., they assume that the robot does not have continuous visibility and has to stop at different scan points in order to sense the environment. They focus on the worst-case number of necessary scan points. Their algorithm results in a competitive ratio of $(r+1)/2$ , where $r$ is the number of reflex vertices in the polygon. For limited range of visibility, they give an algorithm where the competitive ratio in a polygon $P$ can be limited by $\lfloor\frac{8\pi}{3}+\frac{\pi R\times Perimeter(P)}{Area(P)}+\frac{(r+h+1)\pi R^{2}}{Area(P)}\rfloor$ , where $h$ is the number of holes and $R$ is the number of reflex vertices in the polygon.

Albers et al. [2] assume that the environment is modeled by a directed, strongly connected graph and give a $d^{\mathcal{O}\log d}m$ competitive algorithm where $m$ is the number edges in the graph and $d$ is the minimum number of edges that have to be added to make the graph Eulerian. The robot’s task is to visit all nodes and edges of the graph using the minimum number $R$ of edge traversals. For a simple polygon, Hoffmann et al. [26] presented an algorithm which achieves a competitive ratio of $26.5$ . For the special case of an orthogonal polygon, Deng et al. [17] presented a $\sqrt{2}$ competitive exploration strategy with one robot. We show how to extend the single robot exploration algorithm by Deng et al. [17] to the case of $p$ robots. The resulting algorithm has a competitive ratio that is a function of $p$ .

II-B Graph Exploration

The problem of visiting all the nodes in a graph in the least amount of time is known as the Traveling Salesperson Problem (TSP). Here, all nodes of the graph are known before-hand and the objective is to determine the shortest tour visiting all the nodes in the graph exactly once. Finding the optimal TSP tour for a given graph is known to be NP-hard, even for the special case where the nodes in the graph represent points on the Euclidean plane [35]. For the Euclidean version of the problem, there exist polynomial time approximation schemes [35, 4], i.e., for any $\epsilon>0$ , there exists a polynomial time algorithm which guarantees an approximation factor of $1+\epsilon$ .

In the graph exploration version of the problem, nodes are revealed in an online fashion. The objective is to minimize the total distance (or time) traveled. Fraigniaud et al. [23] presented an algorithm for exploration of trees using $p$ robots with a competitive ratio of $\mathcal{O}(p/\log p)$ and a lower bound of $\Omega(2+1/p)$ . This lower bound was improved by Dynia et al. [19] to $\Omega(\log p/\log\log p)$ . They modeled the cost as the maximal number of edges traversed by a robot and presented a $(4-2/p)$ competitive online algorithm. Higashikawa et al. [25] showed that greedy exploration strategies have an even stronger lower bound of $\Omega(p/\log p)$ and presented a $(p+\log p/1+\log p)$ competitive algorithm.

Better bounds have been achieved for restricted graphs. Dynia et al. [18] presented an algorithm that achieves faster exploration for trees restricted by a density parameter $k$ which forces a minimum depth for any subtree depending on its size. Trees embeddable in $d$ -dimensional grids can be explored with a competitive ratio of $\mathcal{O}(d^{1-1/k})$ . For 2-dimensional grids with only convex obstacles, Ortolf et al. [40] improved the competitive ratio to $\mathcal{O}(\log^{2}d)$ . Despite these strong restrictions on the graph, the same lower bound of $\Omega(\log p/\log\log p)$ holds for all trees.

We show that the problem of exploring a polygon can be formulated as a multi-robot tree exploration problem. Our algorithm yields a competitive ratio of $\frac{2(\sqrt{2}{p}+{\log p})}{1+\log p}$ where $p$ is the number of robots.

II-C Information-Theoretic Exploration

Geometric and graph-based approaches typically assume perfect sensing with no noise. In practice, however, measurements are not perfect and we have to account for measurement noise when exploring the environment. Such exploration strategies can be broadly classified into frontier-based and information-theoretic strategies [29].

Information-Theoretic exploration strategies typically use an occupancy grid to represent the maps generated from noisy and uncertain sensor measurement [20]. An occupancy grid is a uniformly-spaced grid, with a binary random variable (per grid cell) representing the probability of the cell being occupied by an obstacle. There are many existing occupancy grid mapping techniques like gmapping [47], octomap [28], RTAB-Map [34] and ORB-slam [38]. Frontier-based exploration strategies are largely greedy in nature and drive the robots to the boundary between known and unknown spaces in the map. In occupancy grids, frontiers are cells determined to be free (probability of occupancy close to zero) which are next to grid cells that have not been observed (probability of occupancy is set to be 0.5), shown in Figure 1. Variants of this strategy have been used to perform exploration of unknown 2D [50, 8, 46] and 2.5D environments [9]. We refer the reader to the comprehensive study by Holz et al. [27] for further details.

Refer to caption — Figure 1: Frontier is the boundary between known free space and unknown space.

Information-theoretic strategies seek to optimize some information measure such as entropy (uncertainty) [49, 36] in the environment, or mutual information [3] while exploring the environment. Mutual information [15] between two random variables $X$ and $Y$ is given by,

I(X;Y)=\sum_{x}\sum_{y}f(x,y)~{}\log\frac{f(x,y)}{f_{1}(x)f_{2}(y)}.

Mutual information in occupancy grid predicts how much future measurements will decrease the robot’s uncertainty associated with all grid cells. Julian et al. [30] studied the computation of mutual information for range based sensors. While these algorithms produce better maps, the planner is typically a one-step greedy algorithm or finite-horizon planners that cannot give global guarantees on the total distance traveled.

In order to overcome this, a better approach may be to combine a global planner along with a local planner. Davis et al. [16] proposed one such algorithm where the goal was to maximizes the coverage of a user-specified region while minimizing the control costs of the robot. The authors are given a start state and a goal state and minimize the probability of collision with the environment as well. Bai et al. [5] used an approach to predict mutual information using Bayesian optimization in which the long-term goal is to reduce entropy throughout the robot’s environment map, and the short-term goal is to perform the sensing action in each iteration that will maximize mutual information. Choudhury et al. [13] showed that supervised learning could be used to predict informative actions without evaluating the expected mutual information exhaustively for every possible action. Charrow et al. [10] attempted to resolve this with a heuristic that uses a global planner for a single robot to determine trajectories that maximizes the Information-Theoretic objective whilst employing a gradient-based trajectory optimization technique to locally refine the chosen trajectory such that the mutual information objective is maximized. We build on this work and present a two-level planner that maximizes information locally while giving strong global performance guarantees on the path length followed by the robots.

III Multi-Robot Geometric Exploration Algorithm

In this section, we present the details of our algorithm for exploring an orthogonal polygon without any holes, $P$ , using a team of $p$ robots. Our algorithm builds on the algorithm by Deng et al. [17] for exploring an orthogonal polygon with a single robot and extends it to the case of multiple robots using the graph exploration strategy from [25]. Our main insight is to show that the path followed by the robot using the algorithm in [17] can be used to construct a tree, denoted by $\mathcal{T}$ , in $P$ . That is, exploring the polygon is equivalent to visiting all nodes in this tree. We show that a multi-robot tree exploration algorithm from [25] can be used to explore and visit every node in this tree. Furthermore, we show that the competitive ratio of our algorithm is bounded (as a function of $p$ ).

We assume all robots start at a common location. The cost of exploration is defined as the time taken for all the robots to return to the starting location having explored the polygon. We say that a polygon $P$ is explored if all points in its interior and on the boundary were seen from at least one robot. For the purpose of the analysis, we assume that the sensor on the robot is an omnidirectional camera with infinite sensing range which returns the exact coordinates of any object in its field of view. We also assume that the robots move at unit speeds and can communicate at all times. In the next section, we show how to adapt our algorithm to realistic sensing models and evaluate it through experiments.

We introduce some terminology used in our algorithm (refer to Figure 2) before presenting the details. The sub-polygon that is visible from a point $x$ is known as the visibility polygon of $x$ and is denoted by $VP(x)$ . Some of the edges in the visibility polygon are part of the boundary of $P$ whereas others are chords in the interior of $P$ (e.g., segment $\overline{gb}$ in Figure 2). A reflex vertex of $P$ which breaks the continuity of the part of the boundary of $P$ visible from $x$ is known as a blocking vertex. Vertex $b$ in Figure 2 is a blocking vertex. Let $\overline{bc}$ be the side incident to $b$ which is (partly) visible from $x$ . The line segment perpendicular to $\overline{bc}$ drawn from $b$ till the boundary of $P$ is known as the extension of the blocking vertex $b$ for an orthogonal polygon. The extension $E_{b}$ is said to be associated with the side $\overline{ab}$ The robot must cross the extension in order to “look beyond” the blocking vertex and explore the polygon.

An extension divides $P$ into two sub-polygons. The one which contains the starting position is known as the home polygon with respect to $b$ . The other sub-polygon is known as the foreign polygon with respect to $b$ and is denoted as $FP(b)$ . The robot must cross the extension $E_{b}$ in order to see the side, $\overline{ab}$ , that is associated with it, since $\overline{ab}$ lies in the foreign polygon. If the foreign polygon of blocking vertex, $b^{\prime}$ , is completed contained in the foreign polygon of $b$ , then $b^{\prime}$ is said to be dominated by $b$ . If a blocking vertex is not dominated by any other blocking vertex, then the corresponding extension is said to be a critical extension. Deng et al. [17] showed that a path that visits every critical extension is sufficient to explore the entire polygon.

Draw the line segment starting from the robot’s position $x$ perpendicular to the extension $\overline{be}$ . The point at which these two line segments intersect ( $E_{g}$ ) is known as the extension goal corresponding to the blocking vertex $b$ .

1 Function explore()

Data: Cluster of robots,

C

, located at some node,

A

, in the tree.

2 if $A$ is marked under-exploration then

\mathcal{A}\leftarrow

children of

A

not marked as explored;

4 if $\mathcal{A}==\emptyset$ then

5 if $A==$ root of the tree then

6 Terminate exploration;

8 else

9 goto(

C

,parent(

A

);

11 end if

13 else

14 Divide

C

equally among

\mathcal{A}

;

15 When any cluster reaches a child of

A

, call explore;

17 end if

19 else

20 Mark

A

as under-exploration;

21 if blocking vertices detected from $A$ then

22 Sort in clockwise direction and add as children of

A

;

23 for $a$ and $b$ are distinct blocking vertices do

24 if $FP(a)\subseteq FP(b)$ then

25 add

a

as child of

b

;

27 else

28 if $FP(a)$ and $FP(b)$ intersect then

29 add the vertex that appears first in the clockwise order as the parent of the vertex that appears later;

30 end if

32 end if

34 end for

35 Divide

C

equally among children of

A

;

36 When any cluster reaches a child of

A

, call explore;

38 else

39 Mark

A

as explored;

40 goto(

C

,parent(

A

));

42 end if

44 end if

Algorithm 1 Multi-Robot Exploration Subroutine

The algorithm starts by creating a tree with the initial position of the robots as the root. All robots start in one cluster located at the root node. We use three labels to keep track of the status of any node in the tree: unexplored, under-exploration, and explored. Whenever a cluster of robots reach a node (using a subroutine goto), we call the subroutine shown in Algorithm 1. The subroutine goto finds the zig-zag-to-point paths between two points, as described by Deng et al. [17]. In addition, if two clusters run into each other, then they merge and travel up the tree together. Furthermore, whenever the goto subroutine is called with a parent node as the destination, the robots do not need to physically travel up to the parent. Instead, they can directly go to the next unexplored node using standard short-cutting techniques.

The root is initially marked as under-exploration. We then check to determine any blocking vertices visible from the current node. We add the extension goals corresponding to any blocking vertices visible from the current node as its children. All of the corresponding extension goals are added as children by sorting them in the clockwise direction. These new children of the current node are adjusted according to the conditions mentioned. These conditions define an ordering over the nodes, by rewiring the tree. The cluster of robots at the current node is then divided as equally as possible and sent to visit the children of the current node. If the current node doesn’t have any children, the current node is marked as explored. The cluster moves to the parent of the current node to explore any of its other children which are unexplored or under-exploration. If a node does not have any children that are unexplored or under-exploration, then that node and its sub-tree are said to be explored. The exploration is said to be completed if the root of the tree is marked as explored.

Consider an example where $P$ has been explored partially by a team of four robots as shown in Figure 3. All four robots $\{r_{1},r_{2},r_{3},r_{4}\}$ start off at the location $A$ which is added as the root of the tree as shown in Figure 4. Node $A$ is marked as under-exploration. The robots observe two blocking vertices, $BV_{1}$ and $BV_{2}$ . The extension goals $B$ and $C$ corresponding to $BV_{1}$ and $BV_{2}$ , respectively, are added as the children of $A$ in the tree. The robots split into two clusters $\{r_{1},r_{2}\}$ and $\{r_{3},r_{4}\}$ . Cluster $\{r_{1},r_{2}\}$ moves towards $B$ and cluster $\{r_{3},r_{4}\}$ moves towards $C$ . The corresponding nodes in the tree are marked as under-exploration. At $B$ , cluster $\{r_{1},r_{2}\}$ observes a new blocking vertex, $BV_{3}$ , and the corresponding extension goal $D$ is added as the child of $B$ . Since there is only one child, the cluster does not split and both robots move towards $D$ . The corresponding node, $D$ , in the tree is marked as under-exploration.

At $C$ , the cluster $\{r_{3},r_{4}\}$ observes a blocking vertex, $BV_{3}$ , and the corresponding extension goal $E$ is added as the child of $C$ . Similarly, $F$ is added as the child of $D$ and $G$ is added as the child of $E$ . At $F$ , cluster $\{r_{1},r_{2}\}$ observes two blocking vertices, the cluster splits into two clusters $\{r_{1}\}$ and $\{r_{2}\}$ . Cluster $\{r_{1}\}$ moves towards $H$ and cluster $\{r_{2}\}$ moves towards $I$ . At $G$ , cluster $\{r_{3},r_{4}\}$ observes that there are no more blocking vertices. Hence, no children are added to $G$ . $G$ is marked as explored and the algorithm checks the predecessor in the tree, $E$ . Since $E$ does not have any other children for exploration, $E$ is marked as explored as well.

Similarly, the algorithm checks its predecessor, $C$ , and marks it as explored. Now the algorithm checks $A$ , it has a child $B$ which is under-exploration. Since there is no blocking vertex, the algorithm proceeds to check $B$ . Similarly, the algorithm proceeds to check $D$ and subsequently $F$ . At $F$ , there are two blocking vertices since neither of the two clusters, $\{r_{1}\}$ or $\{r_{2}\}$ , have reached their goals. Thus, cluster splits into two $\{r_{3}\}$ and $\{r_{4}\}$ . $H$ is assigned to $\{r_{3}\}$ and $I$ is assigned to $\{r_{4}\}$ . The exploration algorithm proceeds in this manner up until the root is marked as explored.

III-A Competitive Ratio Analysis

In this section, we prove that the competitive ratio of our algorithm is bounded with respect to the offline optimal algorithm. We divide our analysis into three steps. First, we show that the paths followed by all the robots can be mapped to navigating on a tree. Next, we bound the sum of the costs of edges in the tree with respect to the offline optimal cost. Finally, we bound the cost of our algorithm with respect to the cost of the tree. The graph created by the algorithm is a tree by construction because once a node is added to the graph it is not added to the graph again.

When $p=1$ , the proposed algorithm is the same as the one given by Deng et al. [17] for orthogonal polygons without holes. They showed that the competitive ratio of this algorithm is $\sqrt{2}$ . The analysis relies on showing that the robot visits the critical extensions in the clockwise order in which their associated sides appear on the boundary of the polygon. We refer the reader to Lemma 6 in [17] for a proof and detailed exposition.

Let $C_{\text{OPT}}^{p}$ denote the time taken by the optimal algorithm for $p$ robots to explore $P$ . Let $C_{\text{RECT}}$ denote the time taken by the strategy from [17] for a single robot to explore $P$ . We have $C_{\text{RECT}}\leq\sqrt{2}C_{\text{OPT}}$ from Theorem 3 in [17] and the assumption that the robots travel with unit speeds. Let $C_{\text{EXPLORE}}^{p}$ be the time taken by the proposed algorithm. We show that the ratio, $C_{\text{EXPLORE}}^{p}/C_{\text{OPT}}$ , is constant for a given $p$ . We will show this by relating both quantities with $C_{\text{TREE}}^{p}$ which is the sum of the lengths of edges in the tree created by $p$ robots. Note that $C_{\text{TREE}}^{p}$ is not the same as $C_{\text{EXPLORE}}^{p}$ since it does not take into account the backtracking that may be involved in the robots’ paths.

Lemma 1.

$C_{\text{TREE}}^{p}\leq\sqrt{2}pC_{\text{OPT}}^{p}.$

Proof.

When a node is added to the tree (Line 20 in Algorithm 1) it is initially unmarked. When a robot reaches a leaf node, it first marks it as under-exploration (Line 15) and then immediately marks it as explored (Line 30). Furthermore, all critical extensions will appear as leaf nodes in the tree (due to the rewiring in Lines 18–25).

Let $T$ be the time taken for the last leaf node to be marked. We have $C_{\text{TREE}}^{p}\leq pT$ since $T$ involves backtracking costs along the tree, whereas $C_{\text{TREE}}^{p}$ only considers the sum of the edge costs.

From Lemma 6 in [17], we know that when $p=1$ , the robot visits all the critical extensions in the clockwise order in which their associated sides appear on the boundary of the polygon. From Theorem 2 and Lemma 4 in [17] it follows that if the critical extensions are visited in this clockwise order (and using rectilinear motion in the goto subroutine), then the cost of the tour, $C_{\text{EXPLORE}}^{1}$ , is no more than $\sqrt{2}C_{\text{OPT}}^{1}$ . When $p>1$ , one robot may not necessarily visit all the critical extensions. However, it still holds that the all critical extensions visited by a robot follow the clockwise order of their associated sides. As a result, we have $T\leq\sqrt{2}C_{\text{OPT}}^{p}$ . Thereby yielding, $C_{\text{TREE}}^{p}\leq\sqrt{2}pC_{\text{OPT}}^{p}$ . ∎

Theorem 1.

If $C_{\text{EXPLORE}}^{p}$ is the cost of exploring the polygon using the proposed strategy and $C_{\text{OPT}}^{p}$ is the cost of exploring the polygon using an optimal offline strategy, then we have,

\displaystyle\frac{C_{\text{EXPLORE}}^{p}}{C_{\text{OPT}}^{p}}\leq\frac{2(\sqrt{2}{p}+{\log p})}{1+\log p}.

(1)

Proof.

The cost of exploring a tree with a recursive depth-first strategy used in the proposed algorithms is given by:

\displaystyle C_{\text{EXPLORE}}^{p}\leq\frac{2(C_{\text{TREE}}^{p}+d_{\max}{\log p})}{1+\log p},

(2)

where $d_{\max}$ is the maximum distance of a leaf node from the root in the tree. This bound comes directly from the result in [25]. In our case, $d_{\max}$ corresponds to the maximum distance of any extension goal from the starting position of the robots. It is easy to see that $d_{\max}\leq C_{\text{OPT}}^{p}$ .

From the previous lemma we have, we have:

\displaystyle C_{\text{EXPLORE}}^{p}\leq\frac{2(\sqrt{2}{p}C_{\text{OPT}}^{p}+C_{\text{OPT}}^{p}{\log p})}{1+\log p},

which yields

\displaystyle\frac{C_{\text{EXPLORE}}^{p}}{C_{\text{OPT}}^{p}}

\displaystyle\leq\frac{2(\sqrt{2}{p}+{\log p})}{1+\log p}.

(3)

∎

Thus, we show that the competitive ratio is bounded as a function of $p$ . Figure 5 shows a plot of this bound as a function of $p$ . We note that while the analysis only holds for the case of an orthogonal polygon without holes, the resulting algorithm can also be applied for polygons with holes. However, in such a case, the underlying graph created by the robots is not guaranteed to be a tree. Consequently, we would have to use analogous algorithms for exploring general graphs with multiple robots to yield a similar competitive ratio. In the simulations, we show the empirical performance of our algorithm in environments with holes.

IV Incorporating Information-Theoretic Planning

Some of the assumptions made for the analysis do not hold in most practical scenarios. In this section, we show how to extend our basic algorithm framework in order to incorporate real-world constraints.

IV-A Finding Blocking Vertices in Occupancy Maps

The main assumption is that of unlimited sensing range. In practice, the robot has a limited sensing range. For example, the robot cannot sense a long corridor with a single observation. Thus, in addition to blocking vertices, the robot has to move to frontiers at the end of its sensing range to sense more of the environment. This increases the distance the robot would have to cover compared to the distance it would have to cover if it had an infinite sensor. The robot thus has to detect two types of frontiers, one due to blocking vertices and the other due to sensing range. The only change to the algorithm is in Line 1 where we check for blocking vertices as well as frontiers due to sensing range.

We first detect blocking vertices in a given scan (as described below). Then frontier cells are clustered together to form frontiers due to sensing range. Any frontier which has a constituent frontier cell neighboring a blocking vertex is then discarded. For a blocking vertex, the extension goal is added on its extension with a slight offset. For frontiers due to sensing range, the middle frontier cell, after clustering, is chosen as the frontier goal.

Due to the sensing uncertainty, we represent the map built by the robots as a 2D occupancy grid (OG) as opposed to a geometric map. An OG is a discretized representation of the environment where each cell represents the probability of that space being occupied. Figure 6-Right shows a representative OG. Cells with a lower probability of occupancy ( $<0.5$ ) are designated as free cells (represented as white in the OG) and cells with a higher probability of occupancy ( $>0.5$ ) are designated as occupied cells (represented as black in the OG). Cells which have not been observed are marked as unknown (represented as gray in the OG).

Typical sensors such as cameras and laser rangefinders have a finite angular resolution. Consider three rays from the laser as shown in Figure 6-Left. The rays intersect obstacles at the cells marked as black and all the cells the ray intersect between the robot (marked in blue) and the cell are marked as free. Due to the finite resolution of the laser ( $0.395^{\circ}$ for the Hokuyo laser used in our system), the gray cells, even though they are in the field of the laser, are unobserved and this leads to gaps in observations. This leads to false frontiers being detected and hence such erroneous frontiers have to be discarded. We employ a simple heuristic of checking the size of a candidate frontier and discard those below a threshold.

Consider the robot (and the laser) located at the blue circle in Figure 6-Right. The red ray represents one of the laser rays. In order to check for blocking vertices, occupied cells with a neighboring frontier cell are shortlisted first. In the figure, the cells marked with yellow and red X are identified. A blocking vertex, as defined earlier, is a reflex vertex. We can detect a reflex vertex in an OG by checking its four neighbors. If the four neighbors are an occupied cell, an unknown cell, a free cell which is a frontier, and a free cell which is not a frontier we mark it as a blocking vertex. In Figure 6-Right, the cell marked with the yellow X is detected as a blocking vertex.

IV-B Information-Theoretic Subroutine

In the analysis for Algorithm 1, we assumed that we follow the shortest paths between A and B when executing the goto(A,B) subroutine. Instead, we can add local detours that will increase the information gain in order to improve the quality of the map. In order to maintain a constant competitive ratio, the robot is given a certain budget. This is known as the orienteering problem [7]. The orienteering problem seeks to find a path of length no more than a given budget, that maximizes the reward collected along the path. The reward is a function of the set of vertices visited along the path. If the reward is modular, then there exists a constant-factor approximation for the problem [7].

Our objective is to minimize the uncertainty in the map. Mutual information [10] is a suitable reward function that can be used. Mutual information measures the expected reduction in the entropy (i.e., uncertainty) of the map based on the measurements it expects to receive along the path. Mutual information is known to be submodular and monotonically increasing [32].²²2Strictly speaking, mutual information is only monotonically increasing when the number of vertices on the path is much less than the number of vertices on the graph [32]. We use the recursive greedy algorithm for submodular orienteering by Chekuri and Pal [11] to add local detours to the path. Chekuri and Pal [11] give a quasi-polynomial time recursive greedy algorithm that yields an $\mathcal{O}(\log\text{OPT})$ approximation for this problem, where OPT is the optimal reward.

The orienteering subroutine takes as input: an input graph, $G$ , a start vertex, $s$ , a goal vertex, $t$ , a budget, $B$ , a set of nodes visited, $X$ and returns a path that maximizes a submodular reward function subject to the budget. The input graph is generated by imposing a grid (add points above and below the path) on the shortest path between $s$ and $t$ as shown in Figure 7. The start vertex is the current position of the robot, and the goal vertex is the next node in the tree to be visited. We set the budget to $B=\alpha d$ , where $d$ is the length of the shortest path from $s$ to $t$ , and $\alpha\geq 1$ is any constant. As $\alpha$ increases, we expect the quality of the map to improve at the expense of the cost of exploration (Figure 7).

Theorem 2.

Let $B=\alpha d$ be the assigned budget to the algorithm, i.e., the robot is allowed to take a path of cost $\alpha\geq 1$ times the cost of the shortest path, $d$ , between the nodes. Let $C_{\text{EXPLORE-INFO}}^{p}$ be the cost of exploring the the polygon using the proposed algorithm along with the information theoretic subroutine. We have,

\displaystyle\frac{C_{\text{EXPLORE-INFO}}^{p}}{C_{\text{OPT}}^{p}}\leq\frac{2\alpha(\sqrt{2}{p}+{\log p})}{1+\log p}

(4)

Proof.

From Theorem 1, we have:

\displaystyle\frac{C_{\text{EXPLORE}}^{p}}{C_{\text{OPT}}}

\displaystyle\leq\frac{2(\sqrt{2}{p}+{\log p})}{1+\log p}

(5)

Each $s-t$ path between nodes is allotted a budget $B$ , hence the total cost of our algorithm $C_{\text{EXPLORE}}^{p}$ is multiplied by at most a factor of $\alpha$ . Hence the total cost of the algorithm with the information theoretic subroutine is given by,

\displaystyle C_{\text{EXPLORE-INFO}}^{p}=\alpha C_{\text{EXPLORE}}^{p}

(6)

Hence, we have,

\displaystyle\frac{C_{\text{EXPLORE-INFO}}^{p}}{C_{\text{OPT}}^{p}}\leq\frac{2\alpha(\sqrt{2}{p}+{\log p})}{1+\log p}

(7)

∎

We modify the subroutine in order to save computation time by restricting the paths to be strictly moving forward, i.e, the path cannot have edges moving back towards $s$ . This reduces the computation time by an order of magnitude. This is a heuristic and therefore the approximation ratio may not necessarily hold for the heuristic.

IV-C General Environments

Our algorithm also works for environments which are not orthogonal when occupancy grids are used as the underlying representation. Occupancy grids, typically, are orthogonal polygons by construction. Furthermore, for environments with holes (i.e., obstacles) the algorithm would create a tree and explore this tree. While this is correct, the distance traveled by the robots can be much higher than the optimal cost and as such the competitive ratio does not hold.

In the next section, we evaluate the empirical performance of our algorithm through simulations in such scenarios.

V Evaluation

In this section, we report results from simulations and hardware experiments.

V-A Simulations

We implemented our algorithm using ROS [43] and carried out simulations in Gazebo [31] in order to verify the correctness of the exploration algorithm in realistic environments. The five Gazebo simulation environments used (Figure 8) are not all orthogonal and simply-connected – assumptions only required for the analysis to hold. The simulated robot is a differential-drive Pioneer P3-DX robot with a 2D Hokuyo laser range finder with a maximum range of 5 meters. The robot is localized in the environment using the amcl [1] package using a pre-defined map. Mapping during exploration is done using the octomap ROS package. The laser scan is converted into a point cloud which is then fed as input to the octomap [28] node. Our implementation is available online at https://github.com/raaslab/Exploration.

Figure 8: Simulation environments 1 – 5 from left to right.

Figure 9 shows various stages of exploration with four robots. The figure also shows the partial exploration tree built by the algorithm. The final trees produced while exploring all the environments are given in Figure 10. Table I shows the maximum distance traveled by a robot (in meters) during exploration for all the environments.

	Env. 1	Env. 2	Env. 3	Env. 4	Env. 5
1 Robot	214.11	362.29	124.77	135.62	156.03
2 Robots	121.95	223.50	76.87	82.69	74.50
4 Robots	127.78	152.18	83.39	63.51	64.36

TABLE I: Maximum distance traveled by a robot to explore environments in Fig. 8.

The cost of exploring environments 1, 3, 4, and 5 remains almost the same when the number of robots is increased from two to four. This is due to the fact that the exploration tree is not a balanced tree (Figure 10). On the other hand, in environment 2, the tree contains four or more under-exploration branches at all times. Consequently, the cost of exploration decreases significantly when four robots are used as opposed to just two.

The effect of the budget on the entropy is shown in table II. We can see that increase in budget reduces the overall entropy in the environment but at the cost of increased exploration time. We can also see that increasing the number of robots also decreases the entropy in the environment.

	Budget	Entropy	Total Time
1 Robot	1	24713	1131
1 Robot	2	24479	2719
1 Robot	4	24131	6187
2 Robots	1	24400	949
2 Robots	2	24267	3406
2 Robots	4	24039	6584

TABLE II: Effect of budget while exploring first environment in Fig. 8.

V-B Experiments

We carried out experiments using a Pioneer P3-DX robot mounted with a Hokuyo URG-04LX-UG01 2D laser and a Kinect2 RGBD sensor (Figure 11-Top). The laser was configured to use 180^∘ field of view with a resolution of 0.395^∘. During the exploration experiments, the robot used the 2D laser for localization on a pre-built map. The map was generated using RTAB-map [34, 33] and the localization was carried out using the amcl package from ROS. Note that having a pre-built map is not a requirement for our algorithm. The amcl localization component could be replaced by, for example, any SLAM implementation. Furthermore, the robots generated a new map during the exploration process using octomapping [28] with the Kinect2 sensor. This map (and not the pre-built map) was used as the basis for finding blocking vertices in the proposed exploration algorithm.

	Budget	Entropy	Total Time(s)
1 Robot	1	78493	1516
1 Robot	2	78128	2453
1 Robot	4	77917	5427

TABLE III: Effect of budget while exploring the corridor shown in Fig. 11 using the real robot

The mapping and localization were performed on the pioneer P3-DX robot with two onboard computers in a master/slave configuration. The slave i7 Intel NUC was dedicated to processing the Kinect2 data and the master i7Intel NUC was dedicated to running the localization and exploration algorithm. Figure 11 shows the results of an exploration experiment in a corridor environment along with the path followed by the robot. This proof-of-concept experiment shows that the algorithm can be applied in real world scenarios. A video of the system in operation is available online at https://github.com/raaslab/Exploration

VI Conclusion

We presented an algorithm for exploring an unknown polygonal environment using a team of $p$ robots in the least amount of time. Our main theoretical contribution was to show that if the underlying environment is an orthogonal polygon without holes then our algorithm yields a constant competitive ratio for fixed $p$ . Next, we showed how to adapt our algorithm so that it can extend to real-world sensing constraints. We then leveraged the mutual information at different locations in the map to reduce the overall entropy in the map whilst maintaining a constant competitive ratio. We verified the behavior of our algorithm through simulations and experiments.

Future work includes extending our analysis to more general environments. Handling general polygons without holes, not necessarily orthogonal, is a direct extension of the algorithm presented here. The notion of blocking vertices remains the same and the underlying graph will still be a tree. However, the extension goal corresponding to the blocking vertex needs to be carefully defined. An immediate avenue of future work is to leverage the algorithm from [17] that allows for obstacles in orthogonal environments. For polygons with holes, the underlying graph is no longer a tree. Hence, a general graph exploration algorithm would have to be used.

References

[1] AMCL ROS Packagel. http://wiki.ros.org/amcl. Accessed: 2016-10-30.
[2] Susanne Albers and Monika R Henzinger. Exploring unknown environments. SIAM Journal on Computing, 29(4):1164–1188, 2000.
[3] Francesco Amigoni and Vincenzo Caglioti. An information-based exploration strategy for environment mapping with mobile robots. Robotics and Autonomous Systems, 58(5):684–699, 2010.
[4] Sanjeev Arora. Polynomial time approximation schemes for euclidean traveling salesman and other geometric problems. Journal of the ACM (JACM), 45(5):753–782, 1998.
[5] Shi Bai, Jinkun Wang, Fanfei Chen, and Brendan Englot. Information-theoretic exploration with bayesian optimization. In Intelligent Robots and Systems (IROS), 2016 IEEE/RSJ International Conference on, pages 1816–1822. IEEE, 2016.
[6] Amitava Bhattacharya, Subir Kumar Ghosh, and Sudeep Sarkar. Exploring an unknown polygonal environment with bounded visibility. In International Conference on Computational Science, pages 640–648. Springer, 2001.
[7] Avrim Blum, Shuchi Chawla, David R Karger, Terran Lane, Adam Meyerson, and Maria Minkoff. Approximation algorithms for orienteering and discounted-reward tsp. SIAM Journal on Computing, 37(2):653–670, 2007.
[8] Wolfram Burgard, Mark Moors, Cyrill Stachniss, and Frank E Schneider. Coordinated multi-robot exploration. IEEE Transactions on robotics, 21(3):376–386, 2005.
[9] Kyle Cesare, Ryan Skeele, Soo-Hyun Yoo, Yawei Zhang, and Geoffrey Hollinger. Multi-uav exploration with limited communication and battery. In 2015 IEEE International Conference on Robotics and Automation (ICRA), pages 2230–2235. IEEE, 2015.
[10] Benjamin Charrow, Gregory Kahn, Sachin Patil, Sikang Liu, Ken Goldberg, Pieter Abbeel, Nathan Michael, and Vijay Kumar. Information-theoretic planning with trajectory optimization for dense 3d mapping. In Proceedings of Robotics: Science and Systems, 2015.
[11] Chandra Chekuri and Martin Pal. A recursive greedy algorithm for walks in directed graphs. In Foundations of Computer Science, 2005. FOCS 2005. 46th Annual IEEE Symposium on, pages 245–253. IEEE, 2005.
[12] Wei-pang Chin and Simeon Ntafos. Optimum watchman routes. Information Processing Letters, 28(1):39–44, 1988.
[13] Sanjiban Choudhury, Ashish Kapoor, Gireeja Ranade, and Debadeepta Dey. Learning to gather information via imitation. arXiv preprint arXiv:1611.04180, 2016.
[14] Vasek Chvatal. A combinatorial theorem in plane geometry. Journal of Combinatorial Theory, Series B, 18(1):39–41, 1975.
[15] Thomas M Cover and Joy A Thomas. Elements of information theory. John Wiley & Sons, 2012.
[16] Bobby Davis, Ioannis Karamouzas, and Stephen J Guy. C-opt: Coverage-aware trajectory optimization under uncertainty. IEEE Robotics and Automation Letters, 1(2):1020–1027, 2016.
[17] Xiaotie Deng, Tiko Kameda, and Christos Papadimitriou. How to learn an unknown environment. i: the rectilinear case. Journal of the ACM (JACM), 45(2):215–245, 1998.
[18] Miroslaw Dynia, J Kutyłowski, F Meyer auf der Heide, and Christian Schindelhauer. Smart robot teams exploring sparse trees. In International Symposium on Mathematical Foundations of Computer Science, pages 327–338. Springer, 2006.
[19] Miroslaw Dynia, Jakub ŁopuszaŃski, and Christian Schindelhauer. Why robots need maps. In International Colloquium on Structural Information and Communication Complexity, pages 41–50. Springer, 2007.
[20] Alberto Elfes. Using occupancy grids for mobile robot perception and navigation. Computer, 22(6):46–57, 1989.
[21] Steve Fisk. A short proof of chvátal’s watchman theorem. Journal of Combinatorial Theory, Series B, 24(3):374, 1978.
[22] Rudolf Fleischer, Tom Kamphans, Rolf Klein, Elmar Langetepe, and Gerhard Trippen. Competitive online approximation of the optimal search ratio. SIAM Journal on Computing, 38(3):881–898, 2008.
[23] Pierre Fraigniaud, Leszek Gasieniec, Dariusz R Kowalski, and Andrzej Pelc. Collective tree exploration. Networks, 48(3):166–177, 2006.
[24] Subir Kumar Ghosh, Joel Wakeman Burdick, Amitava Bhattacharya, and Sudeep Sarkar. Online algorithms with discrete visibility-exploring unknown polygonal environments. IEEE robotics & automation magazine, 15(2):67–76, 2008.
[25] Yuya Higashikawa, Naoki Katoh, Stefan Langerman, and Shin-ichi Tanigawa. Online graph exploration algorithms for cycles and trees by multiple searchers. Journal of Combinatorial Optimization, 28(2):480–495, 2014.
[26] Frank Hoffmann, Christian Icking, Rolf Klein, and Klaus Kriegel. The polygon exploration problem. SIAM Journal on Computing, 31(2):577–600, 2001.
[27] Dirk Holz, Nicola Basilico, Francesco Amigoni, and Sven Behnke. A comparative evaluation of exploration strategies and heuristics to improve them. In ECMR, pages 25–30, 2011.
[28] Armin Hornung, Kai M Wurm, Maren Bennewitz, Cyrill Stachniss, and Wolfram Burgard. Octomap: An efficient probabilistic 3d mapping framework based on octrees. Autonomous Robots, 34(3):189–206, 2013.
[29] Miguel Juliá, Arturo Gil, and Oscar Reinoso. A comparison of path planning strategies for autonomous exploration and mapping of unknown environments. Autonomous Robots, 33(4):427–444, 2012.
[30] Brian J Julian, Sertac Karaman, and Daniela Rus. On mutual information-based control of range sensing robots for mapping applications. The International Journal of Robotics Research, page 0278364914526288, 2014.
[31] Nathan Koenig and Andrew Howard. Design and use paradigms for gazebo, an open-source multi-robot simulator. In Intelligent Robots and Systems, 2004.(IROS 2004). Proceedings. 2004 IEEE/RSJ International Conference on, volume 3, pages 2149–2154. IEEE, 2004.
[32] A. Krause, A. Singh, and C. Guestrin. Near-optimal sensor placements in gaussian processes: Theory, efficient algorithms and empirical studies. The Journal of Machine Learning Research, 9:235–284, 2008.
[33] Mathieu Labbe and Francois Michaud. Appearance-based loop closure detection for online large-scale and long-term operation. IEEE Transactions on Robotics, 29(3):734–745, 2013.
[34] Mathieu Labbé and François Michaud. Online global loop closure detection for large-scale multi-session graph-based slam. In 2014 IEEE/RSJ International Conference on Intelligent Robots and Systems, pages 2661–2666. IEEE, 2014.
[35] Joseph SB Mitchell. Guillotine subdivisions approximate polygonal subdivisions: A simple polynomial-time approximation scheme for geometric tsp, k-mst, and related problems. SIAM Journal on Computing, 28(4):1298–1309, 1999.
[36] Stewart J Moorehead, Reid Simmons, and William L Whittaker. Autonomous exploration using multiple sources of information. In Robotics and Automation, 2001. Proceedings 2001 ICRA. IEEE International Conference on, volume 3, pages 3098–3103. IEEE, 2001.
[37] Rajeev Motwani and Prabhakar Raghavan. Randomized algorithms. Chapman & Hall/CRC, 2010.
[38] Raul Mur-Artal, Jose Maria Martinez Montiel, and Juan D Tardos. Orb-slam: a versatile and accurate monocular slam system. IEEE Transactions on Robotics, 31(5):1147–1163, 2015.
[39] Joseph O’Rourke. Art gallery theorems and algorithms. Oxford University Press Oxford, 1987.
[40] Christian Ortolf and Christian Schindelhauer. Online multi-robot exploration of grid graphs with rectangular obstacles. In Proceedings of the twenty-fourth annual ACM symposium on Parallelism in algorithms and architectures, pages 27–36. ACM, 2012.
[41] Christos H Papadimitriou and Mihalis Yannakakis. Shortest paths without a map. Theoretical Computer Science, 84(1):127–150, 1991.
[42] Aravind Preshant, Kevin Yu, and Pratap Tokekar. A geometric approach for multi-robot exploration in orthogonal polygons. In Workshop on Algorithmic Foundations of Robotics (WAFR), 2016.
[43] M. Quigley, B. Gerkey, K. Conley, J. Faust, T. Foote, J. Leibs, E. Berger, R. Wheeler, and A. Ng. ROS: an open-source Robot Operating System. In ICRA Workshop on Open Source Software, 2009.
[44] Nagewara SV Rao, Srikumar Kareti, Weimin Shi, and S Sitharama Iyengar. Robot navigation in unknown terrains: Introductory survey of non-heuristic algorithms. Technical report, Citeseer, 1993.
[45] Dietmar Schuchardt and Hans-Dietrich Hecker. Two np-hard art-gallery problems for ortho-polygons. Mathematical Logic Quarterly, 41(2):261–267, 1995.
[46] Mac Schwager, Philip Dames, Daniela Rus, and Vijay Kumar. A multi-robot control policy for information gathering in the presence of unknown hazards. In Proceedings of International Symposium on Robotics Research, Aug, 2011.
[47] C Stachniss and G Grisetti. Gmapping project at openslam. org, 2007.
[48] Pengpeng Wang, Ramesh Krishnamurti, and Kamal Gupta. Generalized watchman route problem with discrete view cost. International Journal of Computational Geometry & Applications, 20(02):119–146, 2010.
[49] Peter Whaite and Frank P Ferrie. Autonomous exploration: Driven by uncertainty. IEEE Transactions on Pattern Analysis and Machine Intelligence, 19(3):193–205, 1997.
[50] Brian Yamauchi. A frontier-based approach for autonomous exploration. In Computational Intelligence in Robotics and Automation, 1997. CIRA’97., Proceedings., 1997 IEEE International Symposium on, pages 146–151. IEEE, 1997.

Combining Geometric and Information-Theoretic Approaches for Multi-Robot Exploration

Abstract

I Introduction

II Related Work

II-A Polygon Exploration

II-B Graph Exploration

II-C Information-Theoretic Exploration

III Multi-Robot Geometric Exploration Algorithm

III-A Competitive Ratio Analysis

Lemma 1.

Proof.

Theorem 1.

Proof.

IV Incorporating Information-Theoretic Planning

IV-A Finding Blocking Vertices in Occupancy Maps

IV-B Information-Theoretic Subroutine

Theorem 2.

Proof.

IV-C General Environments

V Evaluation

V-A Simulations

V-B Experiments

VI Conclusion

References

Combining Geometric and Information-Theoretic Approaches
for Multi-Robot Exploration