SEAL: Simultaneous Exploration and Localization in Multi-Robot Systems

Ehsan Latif Ramviyas Parasuraman The authors are with the Heterogeneous Robotics Research Lab, School of Computing, University of Georgia, Athens, GA 30602, USAAuthor emails: {ehsan.latif;ramviyas}@uga.edu

Abstract

The availability of accurate localization is critical for multi-robot exploration strategies; noisy or inconsistent localization causes failure in meeting exploration objectives. We aim to achieve high localization accuracy with contemporary exploration map belief and vice versa without needing global localization information. This paper proposes a novel simultaneous exploration and localization (SEAL) approach, which uses Gaussian Processes (GP)-based information fusion for maximum exploration while performing communication graph optimization for relative localization. Both these cross-dependent objectives were integrated through the Rao-Blackwellization technique. Distributed linearized convex hull optimization is used to select the next-best unexplored region for distributed exploration. SEAL outperformed cutting-edge methods on exploration and localization performance in extensive ROS-Gazebo simulations, illustrating the practicality of the approach in real-world applications.

I Introduction

The growing use of autonomous robots in various applications, including environmental monitoring, search and rescue, and mapping, has made multi-robot robotic exploration a major study field in recent years. In these scenarios, the robot must explore uncharted territory while simultaneously maintaining other criteria such as map accuracy, cost of travel, travel time, and energy savings. Information-based exploration methods have drawn particular attention because of their capacity to facilitate faster exploration and efficiently expand to 3D scenes. Many studies have examined various methods to achieve these objectives. To build a reward function and choose the best control strategies that reduce the uncertainty of maps, these techniques employ information-theoretic metrics like mutual information (MI) [1].

Refer to caption — Figure 1: Overview of the proposed SEAL for simultaneous exploration and localization in the multi-robot system, shown with the ROS Gazebo simulation, configuration for relative localization same as DGORL [2], and the exploration process with the convex hull

Entropy [3], mutual information [4], and Bayesian optimization [5] based multi-robotic exploration techniques are only a few of the many information-based exploration techniques proposed in the literature. To reduce map uncertainty when exploring new areas, these techniques improve the exploration process based on several information-theoretic measures. Specifically, reducing pose uncertainty can significantly reduce map uncertainty, and an information-based controller that takes pose uncertainty reduction into account can direct the robot to a location where it is more likely to find a loop closure, significantly improving the overall accuracy of the trajectory and the resulting map [6].

Accurate positional data is required for multi-robot exploration, which is one of the challenges posed in GPS-denied environments. Conventionally, the robot employs sensors to measure the surroundings and updates its position according to the sensors’ data. This strategy can fail if the sensor is faulty or does not provide enough information. In these circumstances, simultaneous localization and mapping (SLAM) methods can be applied [7], where the robot estimates its position in the environment and simultaneously creates a map of the surrounding area. While the existing multi-robot SLAM approaches in the literature can handle dynamic scenes very well, they might not work well in contexts with few map features or without prior global position information [8]. Furthermore, SLAM requires feature-based map merging, which is a computationally expensive procedure and causes failure during the expansion of the global map. However, we aim to overcome this limitation by only applying efficient GP fusion in parallel to the relative localization, which makes the overall system computationally efficient and provides accurate positioning and mapping information.

This paper proposes a SEAL approach incorporating information-based exploration techniques to enhance the efficiency and accuracy of robotic localization and exploration designed for multi-robot systems. To choose the next-best viewpoint for exploration, we use linearized convex hull optimization, which ensures maximum coverage of the unknown region. Our proposed method uses a Gaussian process (GP) modeling based on the fusion of multiple GPs received from connected robots and relative position estimation, further improving the posterior belief of exploration and localization. Furthermore, our method is appropriate for dynamic settings with limited features or when there is no prior knowledge and does not rely on external sensors. Fig. 1 provides an overview of the proposed method implemented in the Robot Operating System (ROS) framework.

The main contributions of SEAL are as follows:

•

We propose a new distributed exploration strategy for multi-robot systems using Gaussian process (GP) modeling over relative localization with inter-robot communication data. Here, we integrate the Rao-Blackwellization technique to improve the posterior beliefs of both exploration and localization.
•

We focus on maximizing the coverage (exploration) of an unknown environment with multiple robots using the linearized convex hull optimization method.
•

We extensively evaluate our SEAL approach’s exploration performance in realistic ROS Gazebo simulations compared against two state-of-the-art exploration approaches: Rapidly exploring Random Tree (RRT)-based exploration [9] and Deep Reinforcement Learning (DRL)-based exploration [10].
•

To further analyze the performance of localization, we compare with typical Gmapping-based SLAM (used in RRT Exploration [9]) and a relative localization method, DGORL [2], which is based on our prior work.
•

We open source¹¹1https://github.com/herolab-uga/ROS-SEAL our method as a ROS package for use and further development by the robotics community.

The SEAL’s motivation is utilizing data sharing through inter-robot communication to simultaneously localize and explore the environment. It continually improves the beliefs of exploration and localization using Rao-Blackwellization for environments where global position information is unknown. Furthermore, ensuring maximum exploration inspired by convex hull optimization [11] and its linearization addresses the problem of coverage in unknown and constrained environments. Our method does not rely on global information for localization and boundary information for exploration. SEAL achieved high localization, exploration accuracy, and efficiency by employing parallel computation of localization and exploration powered by relative localization [2], Rao-Blackwellization [12], and linearized convex hull optimization to ensure maximum area coverage.

II Related Work

Occupancy grid maps are extensively used in 2D and 3D settings because they can be swiftly queried and updated [13]. Additionally, there are several applications where the data in OGMs can be used for navigation and collision-free path planning. By utilizing these benefits, OGMs offer a new way to quantify map uncertainty in light of newly observed data [14].

Collaborative simultaneous localization and mapping (CSLAM), which is possible with a centralized system, is an alternative strategy to maximize the effectiveness of exploration. Shi et al. [15] presented an adaptive informative sampling strategy that divides the environment into various sections. This strategy, however, necessitates a substantial amount of communication and computational resources. Placed and Castellanos [16] suggested a quick autonomous robotic exploration strategy using an underlying graph structure. Their approach makes it possible to explore new settings quickly and effectively. All of these approaches use OGMs as a source of mapping and exploration.

Frontier detection algorithms are built for fast navigation using rapidly expanding random trees known as RRTs. For instance, to improve the effectiveness and efficiency of multi-robot map exploration, the aim is integrated into an optimization framework that uses RRTs in [9]. However, this strategy’s limitations are the cost of computing the optimization algorithms and the potential for non-optimal results due to the stochastic nature of RRTs.

OGMs are not the only mapping method; continuous occupancy mapping is one (COM). For instance, given the observed observations (training set) comprising free and occupied space, Gaussian processes-based occupancy maps [17] employ kernel inference methods to learn distributions across continuous occupancy. GP maps may query map points at any resolution and record the spatial correlations between sample points. They are particularly useful for extrapolating the occupancy and uncertainty of unseen places. For instance, an adaptive semantic perception strategy for quick and secure outdoor exploration was presented by Wang et al. [18]. Their approach enhances a robot’s ability to adapt to shifting settings. Recently, for robotic exploration, Xu et al. [1] put forth a confidence-rich localization and mapping strategy based on a particle filter. Their method increased the robot’s stability and localization accuracy under challenging situations.

Recent Bayesian kernel inference-based efforts, including [19], offer more computationally effective continuous occupancy mapping techniques. Jadidi et al. [20] presented a GP-based mapping and exploration technique (GPMI) to identify frontiers defined by mean occupancy from the GP map to realize COM-based information-theoretic exploration. In the extended work by Jadidi et al., [21], a high-dimensional logistic regression classifier used to produce a probabilistic frontier representation based on the uncertainty implied by GP is also noteworthy. The forward sensor model (FSM) and combined predictive distribution over the entire map are used to numerically design and compute the GPMI surfaces in the present perception field.

Another method that is gaining popularity in solving exploration and mapping difficulties is reinforcement learning (RL). A deep RL technique for autonomous graph exploration with fallible sensing capabilities was presented by Chen et al. [22]. Their technique performed well but can only be used in relatively narrow surroundings. Chen et al. [23] also proposed a zero-shot RL technique on graphs for autonomous exploration in uncertain scenarios. Although their method drastically reduced training time, it hasn’t been tested in real-world situations environmental changes may hamper robotic exploration in outdoor settings. Moreover, a Voronoi-based strategy that uses DRL for cooperative multi-robot exploration is suggested in [10]. The study aims to improve exploration efficiency by dividing the environment into Voronoi cells and assigning a robot to each cell for study. However, this approach is constrained by the difficulty of training deep RL models and the potential for less-than-ideal results because of the complexity of the environment and the unknown mapping of the environment.

The proposed SEAL approach is compared to frontier detection-based RRT [9] and DRL-based [10] exploration strategies. Both compared approaches are assumed to have accurate position information to be fully functional; robots merge locally examined maps and share them between iterations, increasing computing complexity. SEAL overcomes limitations of existing solutions of explorations that require accurate global positioning information and works on feature matching map merging techniques by providing an optimal solution.

III Problem Formulation

Problem Statement: Given that $n$ robots operating in an uncharted region are wirelessly connected, using shared exploration grids and relative location information, robots should efficiently and accurately perform simultaneous exploration and localization.

Notations: Let $r_{i}$ be the concerning robot, $n$ be the total number of robots deployed in the unknown environment, and $X_{i}$ be the robot’s position. We only consider $x,y$ coordinates of robots from the frame of reference of $r_{i}$ , which is sufficient to update the exploration grid. Let $z_{i,j}$ be the scanned observation between robots $i$ and $j$ and $g_{i}$ be the exploration grid for robot $i$ . Additionally, $w_{i}$ is the robot’s weight at $i^{th}$ exploration grid $g_{i}$ , $s^{i}_{j}$ is the observed Wi-Fi Radio Signal Strength Indicator (RSSI) for connected robot $j$ , and $C_{i}$ is the convex hull of the robot’s postures.

The difficulty lies in figuring out the robots’ positions while minimizing the sensing gap between them and building an accurate grid map of the bounding box. The problem is best described as follows:

\min_{X_{1:n}}\sum_{i=1}^{n}\sum_{j\in\mathcal{N}_{i}}\text{dist}(X_{i},X_{j})

(1)

Each robot $r_{i}$ forms a 2-D Pose Estimation Graph (PEG) $G_{i}=(V_{i},E_{i})$ using the Received Signal Strength Indicator (RSSI) from connected robots. $\tilde{X_{i}}$ is the position estimation of the objective function:

\min_{\tilde{X_{1:n}}}\sum_{i=1}^{n}\sum_{j\in\mathcal{N}_{i}}(\tilde{X_{i}}-X_{j}-z_{i,j})^{2}

(2)

Robot $r_{i}$ forms an exploration grid considering uniform distributions over the workspace and receives observations $z_{i,j}$ from the connected robots at each position and updates its global grid as $g_{global}=\sum_{i=1}^{n}w_{i}g_{i}$ . Further, $\tilde{X_{i}}$ is updated using Rao-Blackwellization [24] for the updated belief state:

p(\tilde{X_{i}},g_{i}|z_{i,j})=p(\tilde{X_{i}}|z_{i,j})p(g_{i}|\tilde{X_{i}})

(3)

Lastly, the convex hull $C_{i}$ is updated using linearized convex hull optimization for the following objective :

\min_{x_{1:n}}\sum_{i=1}^{n}\sum_{j\in\mathcal{N}_{i}}\text{dist}(\tilde{X_{i}},\tilde{X_{j}})

(4)

As the above convex hull objective function is non-linear, it can be linearized as:

minimize

\displaystyle\quad f(\tilde{X_{i}})\ \text{subject to}

\displaystyle\quad\tilde{X_{i}}\in X,

(5)

Where $f(\tilde{X_{i}})$ is the global objective function, $\tilde{X_{i}}$ is the robots’ positions in the region of low wireless vicinity, and $X$ is the set of feasible positions.

f(\tilde{X_{i}})\approx f(\tilde{X_{i}})+\nabla f(\tilde{X_{i}})^{T}(\tilde{X}-\tilde{X_{i}})

(6)

$\nabla f(\tilde{X_{i}})$ is the gradient of the objective function evaluated at $\tilde{X_{i}}$ .

To maximize the coverage, the linearized objective function can be minimized to estimate the following position to navigate for $r_{i}$ . Robots then position themselves to cover the closed region in the bounded area efficiently. Standard barrier certificates and navigation stacks are utilized for safe autonomous navigation. The system architecture of SEAL for the mentioned problem can be visualized in Fig. 2.

1 Required: previous position

X_{t-1}

, local GP

g

, received GPs, RSSI from connected robots

S

, number of connected robots

n

;

2 do in parallel

g

= Local predicted GP grid;

4 for $g_{j}\in GPs$ from connected robots do

5 Merge received

g_{j}

into local

g

;

b_{j}^{g}

= update exploration belief of

g

as Eq. 8;

8 end for

9 for $j\in$ connected robots do

10 Apply relative localization using DGORL [2];

bel_{j}(\tilde{X})

= update position belief;

13 end for

15 end

17Apply Rao-Blackwellization to update exploration as well as position belief;

18 Update position belief using GP belief as Eq. 12;

19 Update GP belief by incorporating Entropy

H(bel({X}))

from Eq. 14;

20 Build occupancy grid map based on updated belief from GP;

C

= received linearized convex hull using Alg. 2;

Navigate the robot to the unexplored region based on

C

for max exploration.

Algorithm 1 Distributed SEAL

IV Approach and Algorithms

SEAL is composed of four major parts:

1.

Global Exploration over Distributed Gaussian Process (GP) Modeling [25].
2.

Probabilistic Extension of Relative localization using DGORL [2].
3.

Integration of relative localization with exploration to update belief state using modified Rao-Blackwellization.
4.

Multi-robot distribution for full coverage using linearized convex hull optimization.

The method proposed in this work leverages the strengths of non-parametric generalization of GP for individual robot exploration, polynomial time graph optimization for relative localization, probabilistic efficiency of Rao-Blackwellization to update belief state, and linearized convex hull optimization for complete coverage without performing Voronoi partitioning. Our proposed SEAL is a merger of relative location with exploration, along with the linearized extension of the convex hull algorithm that can leverage multi-robot teams to cover the bounded region efficiently and comprehensively.

Furthermore, SEAL relies only on shared data between robots for position estimation and exploration, without considering any centralized information sources. Alg. 1 elaborates the distributed approach of SEA utilizing GPs and relative localization belief information to generate a map and determine the unexplored region for the robot to navigate to using linearized convex hull optimization.

IV-A Gaussian Process for Exploration

GP regression is a method that is frequently used to model spatial phenomena. Simulating the hidden mapping from training data to the target phenomenon is possible while considering the uncertainty provided by the natural non-parametric generalization of linear regression. Assume that the target phenomenon, a wireless source, satisfies a multivariate joint Gaussian distribution in this case. The Gaussian probability distribution of the phenomena $\omega(g)$ as given by the mean function $\mu(g)=\varepsilon[(g)]$ and covariance function $\delta^{2}(g,g^{\prime})=\varepsilon[(\omega(g)-\mu(g))^{T}\times(\omega(g)-\mu(g)]$ is the output of the trained GP model using training data.

Formally, let $GP^{[i]}=[g^{[i]}_{1},...,g^{[i]}_{k}]^{T}$ be the grid positions at which $r_{i}$ observed $k$ noisy RSSI readings [26] from the source $S^{[i]}=[s^{[i]}_{1},...,s^{[i]}_{k}]^{T}$ . Since the mean function is assumed to be zero without losing generality, each observation is noisy, with the formula $y=\omega(g)+\epsilon$ . To achieve this, given a sampling location $g_{k}\in GP$ , we have the conditional posterior means $\mu_{g_{k}|GP_{j},y_{j}}$ and variance $\delta^{2}_{g_{k}|GP_{j},y_{j}}$ as follows from the learned GP model defining the Gaussian distribution of $\omega(g_{k})\sim N(\mu_{g_{k}|GP_{j},y_{j}},\delta^{2}_{g_{k}|GP_{j},y_{j}})$ for sample position $k$ .

Once each robot learns its GP model for its current position, they share GPs with the connected robots for global exploration. Since each local GP is assumed to be learned by each robot, we assume that the underlying data distribution to be explored can be characterized by a combination of $n$ fused GPs, $GP=\{GP_{1},...,GP_{n}\}$ received from connected robots. We define $p(g\mid s_{1:n},X_{1:n})=b_{n}^{g}$ as the likelihood that, for each point in the environment, $g\in GP$ is best characterized by the GP that the $r_{i}$ has learned. Then, we produce a combination of GP models weighted by the $b_{n}^{g}$ at any grid $g\in GP$ as a linear combination of ${GP_{1},...,GP_{n}}$ . The means $\mu^{\prime}_{g|GP,Y}$ and variance $\delta^{\prime 2}_{g|GP,Y}$ for robot $i$ are the parameters that define the model as:

		$\displaystyle\mu^{\prime[i]}_{g\mid GP,Y}=\sum\limits_{j=1}^{n}b_{j}^{g^{[i]}}\times{\mu^{[i]}_{g\mid{g_{j}},{{y}_{j}}}}$		(7)
		$\displaystyle\delta^{\prime 2^{[i]}}_{g\mid GP,Y}=\sum\limits_{j=1}^{n}b_{j}^{g^{[i]}}\times\left({\delta_{g\mid{g_{j}},{{y}_{j}}}^{2^{[i]}}+{{\left({{\mu^{[i]}_{g\mid{g_{j}},{y_{j}}}}-\mu^{\prime[i]}_{g\mid GP,Y}}\right)}^{2}}}\right)$		(7)

In this problem, the algorithm computes the weight distribution in the expectation step (E-Step). Then, the maximization step updates the parameters of the local fused GP with the estimated weight distribution (M-Step), the same as discussed in GPMix [15]. Finally, the weight distribution is set to the following values before the first iteration for each random wireless sample $s$ at grid position $g$ :

O_{b^{g}}\approx\left\{{\begin{array}[]{ll}1&{{\text{ if }}{g}\in{GP^{[i]}}},b^{g}\geq\theta\\ 0&{{\text{ Otherwise }}}\end{array}\quad\forall j=1,\ldots,n}\right.\

(8)

Where $\theta$ is the threshold for exploration and $O_{b^{g}}$ is the value of grid position in the exploration grid map.

IV-B Relative Localization

Assume at a given time $t$ , a team of robots considered as vertices of the graph contains $n\in\mathbb{N}$ connected robots can form a weighted undirected graph, denoted by $G=(V,E,A)$ , of order $n$ consists of a vertex set $V=\{v_{1},...,v_{n}\}$ , an undirected edge set $E\in V\times V$ is a range between connected robots and an adjacency matrix $A=\{a_{ij}\}_{n\times n}$ with non-negative element $a_{ij}>0$ , if $(v_{i},v_{j})\in E$ and $a_{ij}=0$ otherwise. An undirected edge $e_{ij}$ in the graph $G$ is denoted by the unordered pair of robots $(v_{i},v_{j})$ , which means that robots $v_{i}$ and $v_{j}$ can exchange information with each other.

Here, we only consider the undirected graphs, indicating that the robots’ communications are all bidirectional. Then, the connection weight between robots $v_{i}$ and $v_{j}$ in graph $G$ satisfies $a_{ij}=a_{ji}>0$ if they are connected; otherwise, $a_{ij}=a_{ji}=0$ . Without loss of generality, it is noted that $a_{ii}=0$ indicates no self-connection in the graph. The degree of robot $v_{i}$ is defined by $d(v_{i})=\sum_{j=1}^{n}a_{ij}$ where $j\neq i$ and $i=1,2,...,n$ . The Laplacian matrix of the graph $G$ is defined as $L_{n}=D-A$ , where $D$ is the diagonal with $D=diag\{d(v_{1}),d(v_{2}),...,d(v_{n})\}$ . If the graph $G$ is undirected, $L_{n}$ is symmetric and positive semi-definite. A path between robots $v_{i}$ and $v_{j}$ in a graph $G$ is a sequence of edges $\{(v_{i},v_{i1}),(v_{i1},v_{i2}),...,(v_{in-1},v_{in}),(v_{in},v_{j})\}$ in the graph with distinct robots $v_{in}\in V$ . An undirected graph $G$ is connected if a path exists between any pair of distinct robots $v_{i}$ and $v_{j}$ where $(i,j=1,...,n)$ .

We begin by creating the Estimated Relative Position Measurement Graph (ERPMG): $G_{E}=(V_{E},E_{E},A_{E})$ , based on the Relative Position Estimation Graph (RPMG): $G=(V,E,A)$ , which can be constructed using range information received from connected robots of an n-robot system and describes the relative position measurements among robots. Based on robotic motion constraints, we extend RPMG to accommodate all possible robot positions in the succeeding time step.

Since each robot also gets RSSI from other robots, we may map the predicted relative positions for each robot using the model and identify the $k$ soft maximum out of them by locating the intersection region as an area of interest. Once we have $k$ possible positions of each robot, we can generate $<n^{k}$ solvable graphs for optimization.

We consider a network with $n$ robots for optimization of expanded graphs, labeled by $V=\{1,2,...,n\}$ and $k$ possible connections to other robots. Every robot $i$ has a local convex objective function and a global constraint set. The network cost function is given by:

	$\displaystyle\mathrm{minimize}~{}~{}$	$\displaystyle f(\mathbf{x})=\sum\limits_{i=1}^{N}f_{i}(\mathbf{x})$
	$\displaystyle\mathrm{subject~{}to}~{}~{}$	$\displaystyle\mathbf{x}\in\mathcal{D}=\left\{{\mathbf{x}\in\mathbf{R}^{k}~{}:~{}c(\mathbf{x})\leq 0}\right\}$

Here, $\mathbf{x}\in R^{k}$ is a global decision vector; $f_{i}:R^{k}\to R$ is the convex objective function of robot $i$ known only by robot $i$ ; $D$ is a bounded convex domain, which is (without loss of generality) characterized by an inequality constraint, i.e., $c(\mathbf{x})\leq 0$ , where $c:R^{k}\to R$ is a convex constraint function known by all robots. In addition, to estimate the relative positioning of connected robots, we also calculate the probability of estimation $bel_{j}(\tilde{X})=p(\tilde{X}_{j}|\tilde{X}_{i},s_{i,j})$ for each relative position, where $\tilde{X}$ corresponds to the position estimation by $\mathbf{x}$ , which will be used in the improvisation of the exploration grid. Technical and theoretical details of the proposed relative localization technique are explained in our previous work DGORL [2].

IV-C Rao-Blackwellization for SEAL

In the Rao-Blackwellization update procedure, according to [12], the proposal distribution is suboptimal, especially when the proprioceptive odometer measurements are less accurate than the position estimates through the relative location as mentioned in Section IV-B. Instead, we can use the persistent exploration grid map belief in Gaussian Process as discussed in Section IV-A to improve the localization accuracy and vice versa. Furthermore, in a Bayesian framework, we explicitly incorporate the exploration grid map belief into the measurement likelihood function:

		$\displaystyle bel_{j}^{[i]}(\tilde{X})$		(9)
		$\displaystyle=\psi\overline{bel}_{j}^{[i]}(\tilde{X})\int_{g}p(s_{j}\|g,\tilde{X}_{1:j},s_{1:j-1})^{[i]}p(g\|\tilde{X}_{1:j-1},s_{1:j-1})^{[i]}dg$
		$\displaystyle=\psi\overline{bel}_{j}^{[i]}(\tilde{X})\int_{g}p(s_{j}\|g,\tilde{X}_{1:j},s_{1:j-1})^{[i]}b_{j-1}^{{g}^{[i]}}dg$

According to the definition of posterior map belief, $b_{j-1}^{g^{[i]}}$ is a sufficient statistic for all previous positions $\tilde{X}_{1:j-1}$ and RSSI $s_{1:j-1}$ , and $\psi$ is the belief update constant, the following expression holds:

p(s_{j}|g,\tilde{X}_{1:j},s_{1:j-1})^{[i]}\approx p(s_{j}|g,\tilde{X}_{j},b_{j-1}^{g})^{[i]}

(10)

Similarly, the belief $b_{j}^{g^{[i]}}$ is also a sufficient statistic for the current estimated position $\tilde{X}_{j}$ and the previous map belief $b_{j-1}^{g^{[i]}}$ , so we can get:

p(s_{j}|g,\tilde{X}_{j},b_{j-1}^{g})^{[i]}\approx p(s_{j}|b_{j}^{g})^{[i]}

(11)

Thus Eq. 9 can be written as:

bel_{j}^{[i]}(\tilde{X})\propto\overline{bel}_{j}^{[i]}(\tilde{X})\int_{g}p(s_{j}|b_{j}^{g})^{[i]}b_{j-1}^{g^{[i]}}dg

(12)

Now the weight of sample $j$ possesses the sample position for robot $i$ can be defined as:

w_{j}^{[i]}=\int_{g}p(s_{j}|b_{j}^{g^{[i]}})b_{j-1}^{g^{[i]}}dg.

(13)

Furthermore, for a sample approximating the position belief, a straightforward method to estimate pose uncertainty is to use all normalized weights in a discretized way as:

	$\displaystyle H({bel}^{[i]}(\tilde{X}))$	$\displaystyle=-\int bel^{[i]}(\tilde{X})\log bel^{[i]}(\tilde{X})d\tilde{X}$		(14)
		$\displaystyle\approx-\sum_{j=1}^{N_{s}}\omega_{j}^{[i]}\log\omega_{j}^{[i]}$		(14)

1 C = convex hull

(O_{f}\cup O_{o})

#convex hull of observed walls and space;

C_{o}=O_{o}\in H

#identify occupied cells on convex hull contour;

L_{o}

= Probabilistic Hough Lines

(C_{o})

#search for lines on convex hull;

4 for each $l$ in $L_{o}$ do

c_{i,j}\in I_{o}

= find intersections

(l)

#identify intersecting hull lines;

7 end for

8C = convex hull

(O_{f}\cup O_{o}\cup I_{o})

#convex hull of observations and intersection points;

9 if C is linear then

10 return C;

12else

L_{C}

= Linearize (C) #Linearize convex hull;

14 return

L_{C}

;

16 end if

Algorithm 2 Distributed Linearized Convex Hull Optimization

IV-D Maximum Exploration with Distributed Linearized Convex Hull Optimization

The primary goal of linearizing convex hull optimization is to maximize the coverage of the explorable space and distribute robots to unknown areas.

The method, as shown in Alg. 2, is a heuristic approach that starts by figuring out the convex hull of the observed map. Observing wall cells in the exploration map on the convex hull is used to forecast potential unobserved areas of the area. The probabilistic Hough line transformation, which uses a maximum likelihood estimation of a line via sparsely connected points, is used to find these unobserved sections of the region. It identifies unseen connections between observed wall segments. The map’s unseen areas are then expanded along the identified wall lines.

Each junction of these wall-line projections adds a projected corner (represented by a single cell, $c_{i,j}\in I_{o}$ ) to the exploration map. Limiting the distance of predicted corners from the closest observations is necessary to prevent nearly parallel lines from predicting corner locations that are unreasonably far from the observed space. The observed areas of the map and the corner points are then combined to form a new convex hull. This new convex hull provides an initial estimation of the boundaries of the exploration space. The cells inside the hull are initially anticipated to be free, while the cells along the hull are predicted to be occupied in the exploration map.

It is possible that the anticipated boundary, which is convex, overestimates the limits of the explorable space. However, this upbeat strategy boosts the payout in hazy places, increasing the motivation for their exploration. Although extending lines suggests that most buildings have a rectangular shape, this is consistent with many areas humans have created. While external borders are considered straight, the projected boundaries are not constrained to any orientation or polygons class, allowing the prediction to adapt to non-rectangular situations.

In the case of a non-linearize convex hull, projecting robots to specific unexplored regions would be non-optimal. To linearize a non-convex hull, we first need to find a convex hull that approximates the non-convex function. This can be done using techniques such as convex relaxation or piecewise linearization. Once we have a convex hull approximation, we can use the following linearization equation:

f(C)\geq\sum_{j=1}^{n}a_{j}\nabla f_{j}(C)+b,

(15)

where $f(C)$ is the non-convex function we want to optimize, $\nabla f_{j}(C)$ are the convex functions that approximate $f(C)$ , $a_{j}$ are non-negative coefficients, and $b$ is a constant that ensures the approximation is valid.

Upkeep on the projected map is done before moving on to the whole exploration. To account for unseen depth in barriers, occupied cells, $O_{o}\cup I_{o}$ , are inflated into $I_{f}$ ; for example, when an obstacle is only visible from one side, it has no observable depth. User-specified inflation depth is determined by prior operating environment knowledge. Then, the map’s unreachable regions inside the expected perimeter are labeled as inferred obstacles.

IV-E Time Complexity

We examine the temporal complexity for each observation using Algs. 1 and 2. The map resolution $\alpha_{M}$ , the map size $n$ , the maximum size of a convex hull $C$ , the number of beams per scan $n_{z}$ , the number of measurement $\alpha_{z}$ , and the one of continuous map occupancy $\alpha_{m}$ for a map of size $m$ is used to define the complexity. We assume that the squared grid’s size is fixed, the query time is constant, and the data structure containing the OG map is identical to the GP data structure.

The proposed SEAL has two parallel processes: Gaussian process fusion for exploration belief estimation and Graph optimization for belief estimation of relative localization. Rao-Blackwellization will use both beliefs to update overall localization and map beliefs. The resulting time complexity of two parallel processes are $O(n^{2})$ and $O(\alpha^{-2}_{M}n_{z}\alpha^{-1}_{z}\alpha^{-1}_{m})$ . It is worth noting that the time complexities of standard Log-Odds for generating OG mapping from GP are $O(n_{z}logn)$ . Note that $n_{z}<n$ normally because of the limited sensing range and narrow beam angle. The time complexity of the proposed SEAL is, at worst, quadratic about the cone size and map resolution for belief update and GP computation, respectively. Importantly, the time cost of map belief update is quite similar to Log-Odds approach in a large scene because the independence of map size $n$ and the actual cone size is much smaller in cluttered environments. The time efficiency of SEAL calculation also depends on the linearization of convex hull optimization, which is $O(nlogn)$ . Hence, the overall upper bound on the time complexity per robot for the proposed SEAL will be $O(n\times m)$ , where m is the number of immediate (connected) neighbor robots.

This time complexity is significantly better than other methods, such as SLAM + RRT [9] and DRL [10], which have complexities $O\left(n\times\log(n)\right)$ and $O(n\times k)$ , respectively, where $k$ is the number of temporal states, and $m<<k$ .

TABLE I: Performance results of different approaches. ^∗ indicates the best performer. L - Localization. E - Exploration.

Parameters	SLAM + RRT	DRL	DGORL	SEAL
	[9]	[10]	[2]	(ours)
	L + E	E Only	L Only	L + E
Exploration Metrics
Mapping Time (s)	$302\pm 22$	$273\pm 33$	N/A	$\textbf{247}^{*}\pm 19$
Total distance (m)	$267\pm 14$	$198\pm 21$	N/A	$\textbf{139}^{*}\pm 11$
Explored Area (%)	$92\pm 2$	$94\pm 3$	N/A	$\textbf{97}^{*}\pm 2$
Localization and Mapping Metrics
Map SSIM	$0.69\pm 0.15$	$0.72\pm 0.12$	N/A	$\textbf{0.88}^{*}\pm 0.09$
ATE (m)	$0.5\pm 0.09$	N/A	$0.09\pm 0.04$	$\textbf{0.04}^{*}\pm 0.02$
ALE (m)	$1.3\pm 0.14$	N/A	$0.81\pm 0.5$	$\textbf{0.63}^{*}\pm 0.2$
Computing, Communication, and Efficiency Metrics
CPU (%)	$109\pm 20$	$88\pm 17$	N/A	$\textbf{82}^{*}\pm 11$
RAM (MB)	$1014\pm 22$	$1321\pm 59$	N/A	$\textbf{891}^{*}\pm 27$
Communication (MB)	$2.4\pm 0.07$	$2.7\pm 0.09$	N/A	$\textbf{1.2}^{*}\pm 0.04$

V Experimental Validation

We evaluated the SEAL approach in two situations in ROS Gazebo-modified versions of the AWS bookstore and house worlds. A laser sensor with a $180^{\circ}$ field of vision, a 5 m range, and 1500 beams per scan are installed on the robot’s wheeled platform. The robot’s maximum speeds in both directions are 0.2 $\frac{m}{s}$ and 0.8 $\frac{rad}{s}$ , respectively. Each robot in the experimentation performs SEAL and generates a map based on updated belief for the GP grid using Rao-Blackwellization (discussed in Section IV-C). Each robot computes linearized convex hull optimization for navigation, subsequently moving to the closest boundary based on the temporal convex hull.

V-A Evaluation Metrics

Exploration: The exploration approach and the method put forward by RRT [9] and DRL [10] are compared with that of SEAL in the bookstore and house worlds, respectively, since they use the same environments so that we can obtain a fair comparison. Below are the used metrics for evaluation:

1.

Mapping Time: The amount of time spent in the mapping (a measure of how effective is the multi-robot exploration);
2.

Total Distance Traveled: This term refers to the cumulative path length of all robot’s trajectory, which is necessary for a multi-robot system to explore the entire map. The entire trajectory length gives an idea of the robot’s energy usage while subtly describing its investigation’s effectiveness;
3.

Exploration Percentage: The percentage of the generated map with time elapsed;
4.

Map SSIM: Structural similarity index measure of generated maps compared with ground truth map;
5.

CPU Utilization: The maximum percentage consumption of the processor of a robot throughout the robot’s trajectory;
6.

Memory Consumption (RAM): The maximum occupied memory by the robot throughout the trajectory;
7.

Communication payload (Data Size): The maximum size of shared data by a robot.

Localization: The localization approach used RRT-Exploration [9] (which used GMapping for SLAM and Map merging to merge maps and find relative localization between robots) and DGORL [2] (which only does relative localization) are compared in our experiments. We use the below metrics for evaluation:

1.

ATE (m): The absolute trajectory error computed for the whole navigated trajectory by each approach, which measured the deviation from the ground truth;
2.

ALE (m): The absolute localization error for the predicted location of each approach.

V-B Exploration Results

Comprehensive listing of all results are provided in Table I. To reduce measurement noise, we examined the exploration performance of each strategy after several trials. There are several simulations run in a three-robot system. Fig. 3 displays the mapping results of each approach made by the three robots in the simulated area. Fig. 3 also delineates the belief maps generated by SEAL, which are not provided by typical SLAM (or map merging) algorithms or other exploration approaches.

Note that the results consider the typical mapping time, the distance traveled, and the effectiveness of the mapping. Comparison with the original map is used to gauge mapping effectiveness and reported percentages are adjusted using gazebo world dimensions. Fig. 4 delineates the reduced traveled distance and achieves a high exploration percentage. It can be seen that the proposed SEAL traveled approximately half the distance and achieved a comparable exploration percentage. Furthermore, The exploration capabilities of RRT, DRL, and SEAL are thoroughly compared in Table I. These results suggest that SEAL is a more effective method for exploration activities, producing superior outcomes.

V-C Localization Results

We also have analyzed the localization performance of SOTA RRT and our previous approach DGORL for localization and compared results with the proposed SEAL. Fig. 5 has shown the trajectory by each approach and localization error in terms of RMSE. Results show that our SEAL algorithm outperformed the RRT Exploration and DGORL, as it incorporates the map belief for location correction and achieved 25% higher localization accuracy than DGORL and 10% higher than RRT. To analyze the localization performance over iterations, Fig. 4 provides evidence for a gradual reduction in RMSE over robotic progression towards the maximum exploration. Additionally, Table I provided exact data showing the significant improvement of the absolute trajectory and localization errors by SEAL compared to the state-of-the-art approaches. Specifically, the proposed SEAL has 25% higher localization accuracy than RRT and DGORL.

VI Conclusion

This paper proposes a novel simultaneous exploration and localization (SEAL) approach for distributed multi-robot systems by integrating Gaussian Processes and graph optimization techniques. SEAL outperformed in exploration and localization performances compared to other state-of-the-art exploration and localization methods. Additionally, compared to RRT and DRL, SEAL decreased the time complexity per robot by around 35% and 29%, respectively, and achieved comparable mapping efficiency with improved localization accuracy than RRT and DGORL. SEAL offers fast and precise exploration without depending on boundary information for localization, making it especially beneficial for various robotic applications; as it does not rely on external sensors, it is appropriate for dynamic settings with few features or no prior knowledge.

References

[1] Y. Xu, R. Zheng, M. Liu, and S. Zhang, “Crmi: Confidence-rich mutual information for information-theoretic mapping,” IEEE Robotics and Automation Letters, vol. 6, no. 4, pp. 6434–6441, 2021.
[2] E. Latif and R. Parasuraman, “Dgorl: Distributed graph optimization based relative localization of multi-robot systems,” arXiv preprint arXiv:2210.01662, 2022.
[3] M. Botteghi, M. Khaled, B. Sirmacek, and M. Poel, “Entropy-based exploration for mobile robot navigation: a learning-based approach,” in Planning and robotics workshop, PlanRob, 2020.
[4] M. G. Jadidi, J. V. Miro, and G. Dissanayake, “Mutual information-based exploration on continuous occupancy maps,” in 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). IEEE, 2015, pp. 6086–6092.
[5] S. Bai, J. Wang, F. Chen, and B. Englot, “Information-theoretic exploration with bayesian optimization,” in 2016 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). IEEE, 2016, pp. 1816–1822.
[6] R. Valencia, J. Andrade-Cetto, R. Valencia, and J. Andrade-Cetto, “Active pose slam,” Mapping, Planning and Exploration with Pose SLAM, pp. 89–108, 2018.
[7] P. Schmuck and M. Chli, “Ccm-slam: Robust and efficient centralized collaborative monocular simultaneous localization and mapping for robotic teams,” Journal of Field Robotics, vol. 36, no. 4, pp. 763–781, 2019.
[8] E. Latif and R. Parasuraman, “Multi-robot synergistic localization in dynamic environments,” in ISR Europe 2022; 54th International Symposium on Robotics. VDE, 2022, pp. 1–8.
[9] L. Zhang, Z. Lin, J. Wang, and B. He, “Rapidly-exploring random trees multi-robot map exploration under optimization framework,” Robotics and Autonomous Systems, vol. 131, p. 103565, 2020.
[10] J. Hu, H. Niu, J. Carrasco, B. Lennox, and F. Arvin, “Voronoi-based multi-robot autonomous exploration in unknown environments via deep reinforcement learning,” IEEE Transactions on Vehicular Technology, vol. 69, no. 12, pp. 14 413–14 423, 2020.
[11] V. Pham, C. Laird, and M. El-Halwagi, “Convex hull discretization approach to the global optimization of pooling problems,” Industrial & Engineering Chemistry Research, vol. 48, no. 4, pp. 1973–1979, 2009.
[12] C. R. Rao, “Rao-blackwell theorem,” Scholarpedia, vol. 3, no. 8, p. 7039, 2008.
[13] A. Hornung, K. M. Wurm, M. Bennewitz, C. Stachniss, and W. Burgard, “Octomap: An efficient probabilistic 3d mapping framework based on octrees,” Autonomous robots, vol. 34, pp. 189–206, 2013.
[14] Z. Zhang, T. Henderson, S. Karaman, and V. Sze, “Fsmi: Fast computation of shannon mutual information for information-theoretic mapping,” The International Journal of Robotics Research, vol. 39, no. 9, pp. 1155–1177, 2020.
[15] Y. Shi, N. Wang, J. Zheng, Y. Zhang, S. Yi, W. Luo, and K. Sycara, “Adaptive informative sampling with environment partitioning for heterogeneous multi-robot systems,” in 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). IEEE, 2020, pp. 11 718–11 723.
[16] J. A. Placed and J. A. Castellanos, “Fast autonomous robotic exploration using the underlying graph structure,” in 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). IEEE, 2021, pp. 6672–6679.
[17] L. P. Swiler, M. Gulian, A. L. Frankel, C. Safta, and J. D. Jakeman, “A survey of constrained gaussian process regression: Approaches and implementation challenges,” Journal of Machine Learning for Modeling and Computing, vol. 1, no. 2, 2020.
[18] Z. Wang, L. Chen, H. Chen, H. Chen, and X. Jiang, “Fast and safe exploration via adaptive semantic perception in outdoor environments,” in 2022 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). IEEE, 2022, pp. 9445–9451.
[19] L. Gan, R. Zhang, J. W. Grizzle, R. M. Eustice, and M. Ghaffari, “Bayesian spatial kernel smoothing for scalable dense semantic mapping,” IEEE Robotics and Automation Letters, vol. 5, no. 2, pp. 790–797, 2020.
[20] M. G. Jadidi, J. V. Miró, R. Valencia, and J. Andrade-Cetto, “Exploration on continuous gaussian process frontier maps,” in 2014 IEEE International Conference on Robotics and Automation (ICRA). IEEE, 2014, pp. 6077–6082.
[21] M. Ghaffari Jadidi, J. Valls Miro, and G. Dissanayake, “Gaussian processes autonomous mapping and exploration for range-sensing mobile robots,” Autonomous Robots, vol. 42, pp. 273–290, 2018.
[22] F. Chen, J. D. Martin, Y. Huang, J. Wang, and B. Englot, “Autonomous exploration under uncertainty via deep reinforcement learning on graphs,” in 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). IEEE, 2020, pp. 6140–6147.
[23] F. Chen, P. Szenher, Y. Huang, J. Wang, T. Shan, S. Bai, and B. Englot, “Zero-shot reinforcement learning on graphs for autonomous exploration under uncertainty,” in 2021 IEEE International Conference on Robotics and Automation (ICRA). IEEE, 2021, pp. 5193–5199.
[24] C. P. Robert and G. O. Roberts, “Rao-blackwellization in the mcmc era,” arXiv preprint arXiv:2101.01011, 2021.
[25] W. Luo and K. Sycara, “Adaptive sampling and online learning in multi-robot sensor coverage with mixture of gaussian processes,” in 2018 IEEE International Conference on Robotics and Automation (ICRA). IEEE, 2018, pp. 6359–6364.
[26] R. Parasuraman, T. Fabry, L. Molinari, K. Kershaw, M. Di Castro, A. Masi, and M. Ferre, “A multi-sensor rss spatial sensing-based robust stochastic optimization algorithm for enhanced wireless tethering,” Sensors, vol. 14, no. 12, pp. 23 970–24 003, 2014.