
CoBigICP: Robust and Precise Point Set Registration using Correntropy Metrics and Bidirectional Correspondence

Pengyu Yin†1, Di Wang†1, Shaoyi Du∗1, Shihui Ying2, Yue Gao∗3, and Nanning Zheng1, IEEE Fellow This work was supported by the National Key Research and Development Program of China under Grant No. 2017YFA0700800, the National Natural Science Foundation of China under Grant Nos. 61971343, 61790563 and 61627811, the Natural Science Basic Research Plan in Shaanxi Province of China under Grant No. 2020JM-012. The authors contribute equally to this paper.1 Authors are with Institute of Artificial Intelligence and Robotics, College of Artificial Intelligence, Xi’an Jiaotong University, Shaanxi 710049, P.R. China. 2 S. Ying is with Shanghai University, Shanghai 200444, P.R. China. 3 Y. Gao is with Tsinghua University, Beijing 10048, P.R. China. *Emails: {dushaoyi,kevin.gaoy}@gmail.com
Abstract

In this paper, we propose a novel probabilistic variant of iterative closest point (ICP) dubbed CoBigICP. The method leverages both local geometric information and global noise characteristics. Locally, the 3D structure of both the target and source clouds is incorporated into the objective function through bidirectional correspondence. Globally, the correntropy error metric is introduced as a noise model to resist outliers. Importantly, a close resemblance between the normal-distributions transform (NDT) and correntropy is revealed. To ease the minimization step, an on-manifold parameterization of the special Euclidean group is proposed. Extensive experiments validate that CoBigICP outperforms several well-known and state-of-the-art methods.

I Introduction

Point cloud registration aims at finding the rigid transformation between two point clouds. It is an essential technique in many fields, including reconstruction, medical applications, and localization for mobile robots. One of the most famous and effective methods for solving the point cloud registration problem is iterative closest point (ICP) [1]. On the whole, a general ICP workflow consists of 1) data point filtering, 2) data association, 3) outlier filtering and 4) error minimization. Many researchers focus on the third stage to achieve a robust and precise algorithm: a large enough portion of outliers can influence the error minimization step more than the inliers, resulting in a misaligned final result.

The most common outlier filtering-based algorithms are Generalized-ICP (GICP) [2] and its variants. Instead of using single points for registration, these approaches embed the geometric structure into the objective function. GICP emphasizes the residual in the normal direction while neglecting the co-planar residual. A well-known GICP variant is normal ICP (NICP) [3], which additionally incorporates the normal vector into the objective function with a simplified information matrix. Tabib et al. [4] proposed a Gaussian mixture model (GMM) based registration technique that forms a more precise weight distribution for point pairs. Parkison et al. [5] converted the solution space from Euler angles to the special Euclidean group and adopted the Cauchy loss as the error function. Wang et al. [6] proposed a mixture of exponential power distributions to model the residual error. However, these methods pay little attention to correspondence establishment and suffer under heavy outliers. Other solutions rely on thresholds [7] or continuous functions.

(a) GT+CoBigICP (b) GT+GICP
Figure 1: Reconstruction results of ETH Hauptgebaude with CoBigICP and GICP. To be more illustrative, with the point clouds placed in the same coordinate frame, we overlay the reconstructed point clouds of CoBigICP and GICP (both colored) on the GT point cloud (white points) separately, with no post-processing. The misalignment in the GICP result is clearly visible, while CoBigICP produces a nearly perfect reconstruction.

In [8], the similarity of GICP and NDT is noticed. The intuition of our work stems from the robust noise modeling implicitly adopted in the normal-distributions transform (NDT) [9] and from the probabilistic model of GICP.

Contributions of this paper are as follows:

  1. The proposed algorithm leverages bidirectional correspondences for a symmetric model.

  2. Correntropy is employed as a robust error metric, and its efficacy is derived and proved. We reveal the close resemblance between NDT and correntropy.

  3. An on-manifold parameterization solution is provided.

The rest of the paper is organized as follows: in Section II we introduce a general framework of ICP and the basic algorithm. In Section III we detail the formulation of the proposed method as well as the relationship between correntropy and NDT. In Section IV we present comparative experiments on different data sets. Finally, in Section V we draw conclusions and discuss future work.

A comparative reconstruction result is shown in Fig. 1 and the source code is available at https://github.com/Pamphlett/CoBigICP.

II Problem Formulation

II-A Mathematical formulation of point set registration

Given two point clouds $\mathcal{A}$ and $\mathcal{B}$, the point cloud registration problem aims at finding a rigid transformation that best aligns the source point cloud $\mathcal{B}=\{\mathbf{b}_{1:N_s}\}$ to the target point cloud $\mathcal{A}=\{\mathbf{a}_{1:N_t}\}$ [8], where $\mathbf{a}$ and $\mathbf{b}$ denote 3D points. We define the association variable $\mathcal{I}\overset{\Delta}{=}\{m_i,n_i\}_{i=1}^{N}\in\mathcal{I}_g$, where $m_i$, $n_i$ indicate that $\mathbf{a}_i\overset{\Delta}{=}\mathbf{a}_{m_i}\in\mathcal{A}$ is a measurement of the same point as $\mathbf{b}_i\overset{\Delta}{=}\mathbf{b}_{n_i}\in\mathcal{B}$, and $\mathcal{I}_g$ is the set of all possible pair-wise correspondences. A general framework of ICP [10] is:

$$\min_{\mathbf{T}}\sum_{i=1}^{N}\left(\mathbf{a}_i-\mathbf{T}\mathbf{b}_i\right)^{\top}\mathbf{\Omega}_i\left(\mathbf{a}_i-\mathbf{T}\mathbf{b}_i\right), \quad (1)$$

where $\mathbf{T}\in\mathrm{SE}(3)$ denotes a rigid transformation that maps $\mathcal{B}$ to $\mathcal{A}$, namely $\mathbf{T}\mathbf{b}_i=\mathbf{R}\mathbf{b}_i+\mathbf{t}$, where $\mathbf{R}\in\mathrm{SO}(3)$ and $\mathbf{t}\in\mathbb{R}^3$ are the rotation matrix and translation vector. $\mathbf{\Omega}_i$ denotes the information matrix that captures the local geometric shape of the point pair $\mathbf{a}_i$ and $\mathbf{b}_i$, and $\mathbf{e}_i=\mathbf{a}_i-\mathbf{T}\mathbf{b}_i$ denotes the pair-wise error. The original ICP algorithm [10] sets $\mathbf{\Omega}_i=\mathbf{I}$. The variant that applies the point-to-plane metric corresponds to $\mathbf{\Omega}_i=\mathbf{n}_i\mathbf{n}_i^{\top}$, with $\mathbf{n}_i\in\mathbb{R}^3$ the surface normal at point $\mathbf{a}_i$.

The ICP algorithm can be summarized into two steps:

  1. Correspondence step: find a putative correspondence between the two scans.

  2. Minimization step: compute a transformation by minimizing the differences between corresponding points.

In practice, the exact correspondence in the correspondence step is usually unknown, but approximations such as the nearest neighbor are proven effective given a good initial value. In the minimization step, the difference between the two point clouds is usually modeled by the sum of squared errors.
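As a concrete illustration of this two-step loop, here is a minimal point-to-point ICP sketch in Python (our illustration only; the paper's implementation is in MATLAB). The closed-form SVD (Kabsch) solver for the minimization step and the use of a KD-tree for nearest-neighbor correspondence are our choices, not part of the paper:

```python
import numpy as np
from scipy.spatial import cKDTree

def best_rigid_transform(A, B):
    """Closed-form (Kabsch/SVD) solution of min sum ||a_i - (R b_i + t)||^2."""
    ca, cb = A.mean(axis=0), B.mean(axis=0)
    H = (B - cb).T @ (A - ca)
    U, _, Vt = np.linalg.svd(H)
    S = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
    R = Vt.T @ S @ U.T
    t = ca - R @ cb
    return R, t

def icp(target, source, iters=20):
    tree = cKDTree(target)              # target is queried repeatedly
    R, t = np.eye(3), np.zeros(3)
    for _ in range(iters):
        moved = source @ R.T + t
        _, idx = tree.query(moved)      # correspondence step: nearest neighbor
        R, t = best_rigid_transform(target[idx], source)  # minimization step
    return R, t
```

Given a good initial alignment (here: small perturbation, identity start), the loop alternates the two steps until the correspondences stabilize.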

From a probabilistic perspective, the objective function of ICP can be reformulated as a well-defined parameter estimation problem and solved efficiently via maximum likelihood estimation (MLE). More specifically, each pair-wise distance error is viewed as an independent measurement, and a proper probability distribution is chosen to generate all these measurements (noise modeling). Minimizing the objective function (1) is then equivalent to maximizing the log-likelihood function.

We remark here that GICP, NICP and NDT [9] all fit within the general ICP framework (1), differing only in their noise modeling.

II-B Drawbacks of Previous Methods

GICP attaches a probabilistic model to the minimization step. More explicitly, the sets of underlying points are modeled by assigning a Gaussian distribution to each point: $\mathbf{a}_m\sim\mathcal{N}(\hat{\mathbf{a}}_m,\bm{\Sigma}_{m,A})$, $\mathbf{b}_n\sim\mathcal{N}(\hat{\mathbf{b}}_n,\bm{\Sigma}_{n,B})$, where $\bm{\Sigma}_{m,A}$ and $\bm{\Sigma}_{n,B}$ are the covariance matrices of each centered point set. With a correspondence $\mathcal{I}$ computed by nearest-neighbor search, the pair-wise error satisfies $\mathbf{e}_i\sim\mathcal{N}(\mathbf{0},\bm{\Sigma}_{i,A}+\mathbf{R}\bm{\Sigma}_{i,B}\mathbf{R}^{\top})$. In this context, every pair-wise error is drawn from the above multivariate Gaussian distribution. Therefore, with MLE, the negative log-likelihood function is derived:

$$\mathbf{T}^{\ast}=\mathop{\arg\min}_{\mathbf{T}}\sum_{i=1}^{N}\mathbf{e}_i^{\top}\left(\bm{\Sigma}_{i,A}+\mathbf{R}\bm{\Sigma}_{i,B}\mathbf{R}^{\top}\right)^{-1}\mathbf{e}_i, \quad (2)$$

where the smallest eigenvalue of each covariance matrix is replaced by a small constant $\epsilon$ to force points to lie on a local plane.

Like many ICP variants, GICP employs a heuristic correspondence (nearest neighbor) that is not robust to outliers. Moreover, the information matrix in (2) is computationally expensive, and the Gaussian assumption is brittle in complex scenarios where outliers are pervasive.

As for NDT, the algorithm first subdivides the reference scan space into a grid of cells. For each cell, a PDF is computed to approximate the local surface. The goal of NDT is then to find the pose transform $\mathbf{T}^{\ast}$ that maximizes the likelihood that the current scan points lie on the reference surface.

Importantly, the PDF of each cell is modeled as a mixture of a normal distribution and a uniform distribution ([11], Chapter 6) to take the distribution of outliers into consideration:

$$\widetilde{p}(\mathbf{x})=c_1\mathcal{N}(\bm{\mu},\bm{\Sigma})+c_2 p_0, \quad (3)$$

where $c_1$ and $c_2$ are constants that can be computed from the user-specified outlier ratio.
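To make the mixture (3) concrete, here is a 1-D numerical sketch (our own simplification, not code from [9] or [11]). The cell span and the way $c_1$, $c_2$ are tied to the outlier ratio are illustrative assumptions:

```python
import numpy as np

def ndt_mixture_pdf(x, mu, var, outlier_ratio=0.3, cell_span=2.0):
    """1-D sketch of eq. (3): Gaussian + uniform mixture over one NDT cell."""
    p0 = 1.0 / cell_span                      # flat outlier density over the (assumed) cell width
    c1 = (1.0 - outlier_ratio) / np.sqrt(2.0 * np.pi * var)  # Gaussian weight
    c2 = outlier_ratio                        # uniform weight
    return c1 * np.exp(-0.5 * (x - mu)**2 / var) + c2 * p0
```

The uniform component puts a floor under the density far from the cell mean, so a stray outlier is never assigned vanishing likelihood.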

NDT also neglects the importance of the correspondence step and adopts an implicit nearest-neighbor association. Spurious correspondences reduce accuracy, and the fixed outlier ratio makes NDT inflexible.

III Proposed Algorithm

Our method improves upon both steps of the ICP framework. Fundamentally, a probabilistic model similar to that of GICP is attached to every point. For the correspondence step, we propose a bidirectional correspondence to ameliorate the heuristic correspondence. For the minimization step, we adopt a robust noise model and derive correntropy as the error metric.

III-A Re-Attaining Symmetry via Bidirectional Correspondence

We incorporate the local structure of both point clouds into the objective function in a two-way process, namely bidirectional correspondence. The bidirectional correspondence includes two parts: the bidirectional search and the bidirectional distance residual. Take $\mathbf{a}_i$ in the target point cloud and its corresponding point $\mathbf{b}_i$ in the source point cloud as an example. The bidirectional search has the following mathematical formulation:

$$\mathcal{I}_{bi}=\left\{m_i,n_i \,\middle|\, m_i=i,\; n_i=\mathcal{C}^{*}_{f}(i),\; \left\|\mathbf{a}_{\mathcal{C}^{*}_{b}(i)}-\mathbf{a}_i\right\|_2<\epsilon\right\}_{i=1}^{N}, \quad (4)$$

where $\mathcal{I}_{bi}$ is an association variable denoting a set of putative correspondences, $m_i,n_i$ are indices of the corresponding points and $\epsilon$ is a Euclidean distance bound. We adopt the following formulation to build the forward correspondence:

$$\mathcal{C}^{*}_{f}(i)=\mathop{\arg\min}_{\mathcal{C}_f(i)\in\{1,2,3,\cdots,N_s\}}\left\|\mathbf{a}_i-\mathbf{T}\mathbf{b}_{\mathcal{C}_f(i)}\right\|_2^2, \quad (5)$$

where $\|\cdot\|_2$ is the Euclidean norm. Conversely, given the forward correspondence, we can compute the backward correspondence $\mathcal{C}^{*}_{b}(i)$ as:

$$\mathcal{C}_b^{*}(i)=\mathop{\arg\min}_{\mathcal{C}_b(i)\in\{1,2,3,\cdots,N_t\}}\left\|\mathbf{a}_{\mathcal{C}_b(i)}-\mathbf{T}\mathbf{b}_{\mathcal{C}^{*}_{f}(i)}\right\|_2^2. \quad (6)$$

By taking this measure, the number of involved points is reduced while the correspondence is also ameliorated.
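A minimal sketch of the bidirectional search (4)-(6) in Python (ours; KD-trees are our choice of nearest-neighbor structure):

```python
import numpy as np
from scipy.spatial import cKDTree

def bidirectional_search(A, B, R, t, eps=0.1):
    """Return index pairs (i, Cf(i)) whose back-match lands within eps of a_i."""
    TB = B @ R.T + t                    # transformed source points T*b
    tree_A, tree_B = cKDTree(A), cKDTree(TB)
    _, cf = tree_B.query(A)             # forward correspondence, eq. (5)
    _, cb = tree_A.query(TB[cf])        # backward correspondence, eq. (6)
    keep = np.linalg.norm(A[cb] - A, axis=1) < eps   # consistency test of eq. (4)
    idx = np.nonzero(keep)[0]
    return idx, cf[idx]
```

Pairs whose backward match drifts away from the original target point are simply dropped, which prunes spurious correspondences before the minimization step.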

For the minimization step, to incorporate local information into the objective function, we propose the bidirectional distance residual. First, we adopt the same approximation as NICP [3] and neglect the covariance matrix of the source point cloud, which leads to a fixed information matrix during the whole registration process. Denoting by $\mathcal{I}_{bi}$ the result of the bidirectional search, the uni-directional distance objective function can be formulated as:

$$\mathcal{J}_{uni}(i)=\left(\mathbf{a}_i-\mathbf{T}\mathbf{b}_i\right)^{\top}\mathbf{\Omega}_{i,A}\left(\mathbf{a}_i-\mathbf{T}\mathbf{b}_i\right). \quad (7)$$

Conversely, to also incorporate the structure of the source point cloud, the following bidirectional distance objective function is derived:

$$\mathcal{J}_{bi}(i)=\left(\mathbf{a}_i-\mathbf{T}\mathbf{b}_i\right)^{\top}\mathbf{\Omega}_{i,A}\left(\mathbf{a}_i-\mathbf{T}\mathbf{b}_i\right)+\left(\mathbf{b}_i-\mathbf{T}^{-1}\mathbf{a}_i\right)^{\top}\mathbf{\Omega}_{i,B}\left(\mathbf{b}_i-\mathbf{T}^{-1}\mathbf{a}_i\right). \quad (8)$$

Since $\mathbf{b}_i-\mathbf{T}^{-1}\mathbf{a}_i=\mathbf{T}^{-1}\left(\mathbf{T}\mathbf{b}_i-\mathbf{a}_i\right)$, the above equation can be simplified as:

$$\mathcal{J}_{bi}(i)=\left(\mathbf{a}_i-\mathbf{T}\mathbf{b}_i\right)^{\top}\left(\mathbf{\Omega}_{i,A}+\mathbf{R}\mathbf{\Omega}_{i,B}\mathbf{R}^{\top}\right)\left(\mathbf{a}_i-\mathbf{T}\mathbf{b}_i\right), \quad (10)$$

with

$$\mathbf{\Omega}_i=\mathbf{\Omega}_{i,A}+\mathbf{R}\mathbf{\Omega}_{i,B}\mathbf{R}^{\top} \quad (11)$$

being the bidirectional information matrix. Recall the information matrix of GICP is:

$$\mathbf{\Omega}_i^{G}=\left(\mathbf{\Omega}_{i,A}^{-1}+\mathbf{R}\mathbf{\Omega}_{i,B}^{-1}\mathbf{R}^{\top}\right)^{-1}. \quad (12)$$

More explicitly, the bidirectional residual is obtained as follows: by fixing the target point cloud and applying the transformation to the source point cloud, we get the forward residual error; by fixing the source point cloud and applying the inverse transformation to the target point cloud, we get the backward residual error. Comparing (11) with (12), the former can be viewed as an approximation of the original GICP information matrix. With this approximation we avoid one matrix inversion and ease the computation of the gradient.
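The two information matrices can be compared directly; a small NumPy sketch (ours) makes the difference explicit, with the bidirectional form needing no inversion:

```python
import numpy as np

def info_bidirectional(Om_A, Om_B, R):
    """Eq. (11): bidirectional information matrix, no inversion required."""
    return Om_A + R @ Om_B @ R.T

def info_gicp(Om_A, Om_B, R):
    """Eq. (12): the original GICP information matrix (two inversions plus one)."""
    return np.linalg.inv(np.linalg.inv(Om_A) + R @ np.linalg.inv(Om_B) @ R.T)
```

For identical isotropic inputs the two differ by a scale (2I versus 0.5I below), which is harmless for the minimizer but much cheaper to compute.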

The intuition behind the above formulation comes from an argument of neutrality. Given a correspondence between a point pair, it is not reasonable to emphasize either point: the source point cloud contains an amount of information equivalent to that of the target point cloud. Both point clouds should be treated equally and contribute equally to the final objective function. Consequently, the information of both point clouds is considered and a bidirectional correspondence is established.

III-B Correntropy-Based Noise Modelling

In most variants of ICP, the mean squared error (MSE) is used to measure the residual error. These methods cannot efficiently resist heavy outliers. Fig. 2(a) illustrates an MSE-type error function in the joint space. MSE is a quadratic function: it takes small values at the bottom of the valley and equals 0 on the $x=y$ line. But as the difference between $x$ and $y$ grows, the MSE function rises quadratically and amplifies the residual error. This makes MSE suitable for distributions like the Gaussian but improper for heavy-tailed distributions, which is exactly the case when outliers are present in point clouds [12].

(a) MSE (b) Correntropy
Figure 2: Illustration of MSE and correntropy. Compared to MSE, correntropy is not sensitive to outliers, as it drops steadily away from the central line.

An illustration of correntropy in the joint space is shown in Fig. 2(b). Correntropy emphasizes errors close to the $x=y$ line and drops steadily away from it, which means correntropy keeps the residual error from being excessively amplified and is more suitable as a similarity measure for non-Gaussian variables.

As stated by Hasanbelliu et al. in [12], given a finite number of points $\{\mathbf{x}_i,\mathbf{y}_i\}_{i=1:N}$, a sample estimator of correntropy is:

$$\hat{v}(X,Y)=\frac{1}{N}\sum_{i=1}^{N}G_{\sigma}\left(\left\|\mathbf{x}_i-\mathbf{y}_i\right\|_2^2\right), \quad (13)$$

where $G_{\sigma}(r^2)=\frac{1}{\sqrt{2\pi}\sigma}\exp\left(-\frac{r^2}{2\sigma^2}\right)$ is the density function of the Gaussian kernel with bandwidth parameter $\sigma$. For the derivation, please refer to [12].
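A direct transcription of the estimator (13) in Python (our sketch):

```python
import numpy as np

def gaussian_kernel(r2, sigma):
    """G_sigma(r^2) of eq. (13): Gaussian kernel on the squared distance."""
    return np.exp(-r2 / (2.0 * sigma**2)) / (np.sqrt(2.0 * np.pi) * sigma)

def sample_correntropy(X, Y, sigma=1.0):
    """Sample estimator of correntropy, eq. (13), for paired point sets."""
    r2 = np.sum((X - Y)**2, axis=1)
    return np.mean(gaussian_kernel(r2, sigma))
```

Because the kernel saturates toward zero for large residuals, a single gross outlier lowers the estimate by at most $G_\sigma(0)/N$, whereas it would dominate an MSE-based score.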

Correntropy is known to be robust to non-Gaussian noise and has been introduced into the ICP algorithm [13]. Although such metrics are widely used in robotics, the concept of entropy and the associated parameter tuning are often confusing and can be challenging in real-world applications. We discover, however, that correntropy is equivalent to the noise modeling in NDT. Therefore, knowledge relating to NDT can be applied to correntropy metrics directly.

Consider a pair of points $(\mathbf{a}_i,\mathbf{b}_i)$ associated by the bidirectional search $\mathcal{I}_{bi}$. The relationship between the two points is $\mathbf{a}_i=\mathbf{T}^{\ast}\mathbf{b}_i+\bm{\epsilon}_i$, where $\bm{\epsilon}_i$ is the sensor error, which is well handled by the noise modeling of GICP. When the point pair is an inlier, the Gaussian assumption on the pair-wise residual error is valid. However, when one of the points is an outlier, the pair-wise residual error $\mathbf{e}_i=\mathbf{a}_i-\mathbf{T}^{\ast}\mathbf{b}_i-\bm{\epsilon}_i$ cannot be properly modeled by a single Gaussian distribution, but rather by a mixture of a Gaussian and a uniform distribution:

$$\mathbf{e}_i\sim\mathcal{N}\left(\mathbf{0},\mathbf{\Omega}_i\right)+\mathcal{U}_i, \quad (14)$$

or equivalently as:

$$p_{mix}\left(\mathbf{e}_i\right)=c_1\exp\left(-\frac{\mathbf{e}_i^{\top}\mathbf{\Omega}_i\mathbf{e}_i}{2}\right)+c_2 p_0, \quad (15)$$

where $p_0$ is the uniform density accounting for the outlier ratio. Using MLE, the objective function can be derived as:

$$\mathbf{T}^{\ast}=\mathop{\arg\max}_{\mathbf{T}}\prod_{i=1}^{N}p_{mix}\left(\mathbf{e}_i\right)=\mathop{\arg\min}_{\mathbf{T}}\sum_{i=1}^{N}-\log p_{mix}\left(\mathbf{e}_i\right). \quad (16)$$

The PDF in (15) does not have a convenient mathematical form from an optimization perspective, as the resulting objective function has no simple first- and second-order derivatives. We therefore use the approximation provided by [9] for the negative log-likelihood in (16):

$$-\log p_{mix}\left(\mathbf{e}_i\right)\approx d_1\exp\left(-d_2\frac{\mathbf{e}_i^{\top}\mathbf{\Omega}_i\mathbf{e}_i}{2}\right)+d_3, \quad (17)$$

where $d_1$, $d_2$ and $d_3$ are computed by requiring the approximation to behave like $-\log p_{mix}(\mathbf{e}_i)$ at extreme values. Fig. 3(b) shows the validity of this approximation (the green line).

The parameter $d_3$ can be omitted, as it is just an offset and has no effect on the minimizer of the summed objective function in (16).
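One way to make the fitting explicit is to require (17) to agree with the exact negative log-likelihood at $r=0$, at $r=1$ and as $r\to\infty$, with $r$ the Mahalanobis radius; this gives closed-form $d_1$, $d_2$, $d_3$. The following sketch (ours, in the spirit of [9]) solves these three conditions:

```python
import numpy as np

def fit_score_params(c1, c2, p0):
    """Fit d1*exp(-d2*r^2/2)+d3 to g(r) = -log(c1*exp(-r^2/2)+c2*p0)."""
    g = lambda r: -np.log(c1 * np.exp(-0.5 * r**2) + c2 * p0)
    d3 = -np.log(c2 * p0)                    # r -> inf: the Gaussian term vanishes
    d1 = g(0.0) - d3                         # exact match at r = 0
    d2 = -2.0 * np.log((g(1.0) - d3) / d1)   # exact match at r = 1
    return d1, d2, d3
```

The approximation is exact at the three anchor radii and smooth everywhere, so first- and second-order derivatives are trivial to evaluate.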

(a) Likelihood (b) Negative log-likelihood
Figure 3: Illustration of the likelihood and negative log-likelihood. The green line shows the effectiveness of the Gaussian approximation to $p_{mix}(x)$.

Comparing (17) to (13), they share the same mathematical form, with a scaled squared Euclidean distance inside the exponential term. The bandwidth parameter $\sigma$ in (13) plays the same role as $d_2$ in (17). For NDT, the outlier ratio is fixed, which is equivalent to a fixed $\sigma$ in correntropy. We therefore conclude that correntropy is equivalent to the noise modeling in NDT. Moreover, drawing on insights from optimization in correntropy-based applications, $\sigma$ should be changed gradually during the optimization, which is equivalent to adaptively tuning the outlier ratio in NDT. Therefore, the proposed CoBigICP has the potential to be more robust and accurate than NDT.

To recap briefly, the pair-wise errors can be viewed as Gaussian if they are inliers, but not if they are outliers. Since we can never have a perfect correspondence, the error distribution may not be Gaussian and MSE is not optimal. We use a more delicate noise model, a mixture of a Gaussian distribution and a uniform distribution, similar to NDT. As shown above, this is in fact a correntropy-based noise modeling technique. Thus, with the bidirectional search $\mathcal{I}_{bi}$, the objective function of the proposed method is formulated as:

$$\mathbf{T}^{\ast}=\mathop{\arg\max}_{\mathbf{T}}\sum_{i=1}^{N}G_{\sigma}(r_i),\qquad r_i^2=\left(\mathbf{a}_i-\mathbf{T}\mathbf{b}_i\right)^{\top}\mathbf{\Omega}_i\left(\mathbf{a}_i-\mathbf{T}\mathbf{b}_i\right). \quad (18)$$

Note that the sign of the regularization term is changed due to the differing signs of MSE and correntropy. The bandwidth parameter $\sigma$ can easily be tuned to model the outlier ratio. As shown in Hasanbelliu's work [12], one feasible approach is to shrink the kernel size with a decay rate during optimization. In this respect, our approach is more flexible than a fixed outlier ratio.

III-C Lie-Algebra-Based Solver

The point set registration problem is finally converted into the optimization of a smooth function on a differentiable manifold. Traditional approaches to (18) are based on gradient descent and usually converge slowly. We propose a Lie-algebra-based solver that converts the problem into an iteratively reweighted least squares (IRLS) problem whose solution can be found in analytical form. The proposed solving process follows the standard workflow of Riemannian optimization methods [14]. The solution $\mathbf{T}$ lies on a Riemannian manifold, the special Euclidean group $SE(3)$, and the rotation matrix $\mathbf{R}$ belongs to the special orthogonal group $SO(3)$, defined as $SO(3)\overset{\text{def}}{=}\{\mathbf{R}\in\mathbb{R}^{3\times 3}:\mathbf{R}^{\top}\mathbf{R}=\mathbf{I},\det(\mathbf{R})=1\}$ and $SE(3)\overset{\text{def}}{=}\left\{\mathbf{T}=\begin{pmatrix}\mathbf{R}&\mathbf{t}\\ \mathbf{0}&1\end{pmatrix}:\mathbf{R}\in SO(3),\mathbf{t}\in\mathbb{R}^3\right\}$. The group $SO(3)$ forms a smooth manifold. The tangent space of the manifold, denoted $\mathfrak{so}(3)\cong\mathbb{R}^3$, is called the Lie algebra (similarly $\mathfrak{se}(3)\cong\mathbb{R}^6$). $\mathfrak{so}(3)$ coincides with the $3\times 3$ skew-symmetric matrices, obtained via the hat operator:

$$\bm{\phi}^{\wedge}=\begin{bmatrix}\phi_1\\ \phi_2\\ \phi_3\end{bmatrix}^{\wedge}=\begin{bmatrix}0&-\phi_3&\phi_2\\ \phi_3&0&-\phi_1\\ -\phi_2&\phi_1&0\end{bmatrix}\in\mathfrak{so}(3). \quad (19)$$
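The hat operator is straightforward to implement; a NumPy sketch (ours):

```python
import numpy as np

def hat(phi):
    """Map phi in R^3 to the skew-symmetric matrix phi^ of eq. (19)."""
    return np.array([[0.0, -phi[2], phi[1]],
                     [phi[2], 0.0, -phi[0]],
                     [-phi[1], phi[0], 0.0]])
```

It encodes the cross product, `hat(a) @ b == np.cross(a, b)`, and satisfies the skew-symmetry `hat(a).T == -hat(a)` used later in the derivation of (23).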

A standard approach to optimization problems on manifolds [14] consists of determining a retraction $\mathcal{R}_{\mathbf{T}}$. The objective function is then reparametrized as:

$$\mathop{\max}_{\mathbf{T}\in SE(3)}f(\mathbf{T})\;\Rightarrow\;\mathop{\max}_{\delta\mathbf{x}\in\mathbb{R}^n}f(\mathcal{R}_{\mathbf{T}}(\delta\mathbf{x})). \quad (20)$$

As stated by Forster in [15], the use of the retraction allows framing the optimization problem over a Euclidean space of suitable dimension, e.g. $\delta\mathbf{x}\in\mathbb{R}^3$ when working in $SO(3)$ and $\delta\mathbf{x}\in\mathbb{R}^6$ when working in $SE(3)$. The retraction is a bijective map between a perturbation $\delta\mathbf{x}$ in the tangent space at $\mathbf{x}$ and a neighborhood of $\mathbf{x}$ on the manifold. The right-hand side of (20) can then be formulated as an IRLS problem. By solving the IRLS we obtain a vector $\delta\mathbf{x}^{\ast}$ in the tangent space, with which the current guess $\mathbf{x}$ on the manifold is updated.

The exponential map, which associates an element of the Lie algebra with a rotation, has the first-order approximation $\exp(\bm{\phi}^{\wedge})\approx\mathbf{I}+\bm{\phi}^{\wedge}$. The exponential map itself is a valid retraction, but not the computationally optimal one [15]. In this paper, we use the following retraction for $SE(3)$:

$$\mathcal{R}_{\mathbf{T}}(\delta\mathbf{x})=\left(\exp(\bm{\xi}^{\wedge})\mathbf{R},\;\exp(\bm{\xi}^{\wedge})\mathbf{t}+\Delta\mathbf{t}\right), \quad (21)$$

where $\delta\mathbf{x}=\begin{bmatrix}\bm{\xi}\\ \Delta\mathbf{t}\end{bmatrix}\in\mathbb{R}^6$ and $\mathcal{R}_{\mathbf{T}}(\delta\mathbf{x})$ denotes the retraction at $\mathbf{T}\doteq(\mathbf{R},\mathbf{t})$.

Applying the first-order approximation $\exp(\bm{\phi}^{\wedge})\approx\mathbf{I}+\bm{\phi}^{\wedge}$ to (21), we get the perturbed versions of $\mathbf{R}$ and $\mathbf{t}$:

$$\mathcal{R}_{\mathbf{T}}(\delta\mathbf{x})=\left(\left(\mathbf{I}+\bm{\xi}^{\wedge}\right)\mathbf{R},\;\left(\mathbf{I}+\bm{\xi}^{\wedge}\right)\mathbf{t}+\Delta\mathbf{t}\right). \quad (22)$$

We can now reparametrize the pair-wise error in (18):

$$\mathbf{a}_i-\mathbf{R}\mathbf{b}_i-\mathbf{t}=\mathbf{a}_i-\left(\mathbf{I}+\bm{\xi}^{\wedge}\right)\mathbf{R}\mathbf{b}_i-\left(\mathbf{I}+\bm{\xi}^{\wedge}\right)\mathbf{t}-\Delta\mathbf{t}=\mathbf{v}_i+\mathbf{H}_i\delta\mathbf{x}, \quad (23)$$

with $\mathbf{v}_i=\mathbf{a}_i-\mathbf{R}\mathbf{b}_i-\mathbf{t}$ and $\mathbf{H}_i=\begin{bmatrix}(\mathbf{R}\mathbf{b}_i+\mathbf{t})^{\wedge}&-\mathbf{I}\end{bmatrix}$. In the derivation of (23), the cross-product property $\mathbf{a}^{\wedge}\mathbf{b}=-\mathbf{b}^{\wedge}\mathbf{a}$ is used. Now (18) takes the following form:

$$\delta\mathbf{x}^{\ast}=\mathop{\arg\min}_{\mathbf{x}}\sum_{i=1}^{N}w_G\left(\mathbf{v}_i+\mathbf{H}_i\mathbf{x}\right)^{\top}\mathbf{\Omega}_i\left(\mathbf{v}_i+\mathbf{H}_i\mathbf{x}\right)=\mathop{\arg\min}_{\mathbf{x}}\left(\mathbf{x}^{\top}\mathbf{A}\mathbf{x}+2\mathbf{b}^{\top}\mathbf{x}+const\right)=-\mathbf{A}^{\dagger}\mathbf{b}, \quad (24)$$

where $\mathbf{A}=\sum_{i=1}^{N}w_G\mathbf{H}_i^{\top}\mathbf{\Omega}_i\mathbf{H}_i$, $\mathbf{b}=\sum_{i=1}^{N}w_G\mathbf{H}_i^{\top}\mathbf{\Omega}_i\mathbf{v}_i$ and $w_G=G_{\sigma}(r_i)$ is the weight function from (13).

Importantly, $\mathbf{A}^{\dagger}$ denotes the pseudoinverse of $\mathbf{A}$, which handles the case where $\mathbf{A}$ is rank-deficient.

The above derivation assumes that, in each iteration, the rotation matrix inside the information matrix in (18) can be approximated by the result of the previous iteration. Under this assumption, the objective function of the Lie-algebra-based optimization problem remains quadratic. Consequently, the result has an elegant closed form and there is no need for gradient-based methods. After obtaining $\delta\mathbf{x}^{\ast}$, the current estimate on $SE(3)$ is updated by applying the retraction (21) again.
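Putting (19), (23) and (24) together, one IRLS iteration can be sketched as follows (our Python sketch; the paper's solver is implemented in MATLAB). The weight $w_G$ is held fixed within the step, and the pseudoinverse handles rank deficiency:

```python
import numpy as np

def hat(phi):
    """so(3) hat operator of eq. (19)."""
    return np.array([[0.0, -phi[2], phi[1]],
                     [phi[2], 0.0, -phi[0]],
                     [-phi[1], phi[0], 0.0]])

def irls_step(A_pts, B_pts, Omegas, R, t, sigma):
    """One reweighted least-squares step, eq. (24): returns dx* = -A^+ b."""
    A = np.zeros((6, 6))
    b = np.zeros(6)
    for a_i, b_i, Om in zip(A_pts, B_pts, Omegas):
        v = a_i - R @ b_i - t                          # v_i of eq. (23)
        H = np.hstack([hat(R @ b_i + t), -np.eye(3)])  # H_i = [(Rb+t)^, -I]
        r2 = v @ Om @ v
        w = np.exp(-r2 / (2.0 * sigma**2))             # correntropy weight w_G
        A += w * (H.T @ Om @ H)
        b += w * (H.T @ Om @ v)
    return -np.linalg.pinv(A) @ b                      # pseudoinverse: rank-deficient safe

def retract(R, t, dx):
    """Update (R, t) with the retraction (21), using the full exponential map."""
    xi, dt = dx[:3], dx[3:]
    th = np.linalg.norm(xi)
    K = hat(xi)
    if th < 1e-12:
        E = np.eye(3) + K                              # first-order fallback
    else:
        # Rodrigues' formula for exp(xi^)
        E = np.eye(3) + np.sin(th) / th * K + (1.0 - np.cos(th)) / th**2 * (K @ K)
    return E @ R, E @ t + dt
```

For a pure translation offset with identity information matrices, a single step recovers the offset exactly, since the residual model (23) is linear in $\delta\mathbf{x}$.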

(a) ECDF of translation error (b) ECDF of rotation error
Figure 4: The empirical cumulative distribution function (ECDF) of different methods on ETH Hauptgebaude. The x axis represents the translation or rotation error, while the y axis represents the probability of producing at most this error.

IV Experiments

We conduct a variety of experiments to test the robustness, accuracy and efficiency of the proposed method and to compare our system with several state-of-the-art methods. A group of well-known and publicly available data sets is used: the ETH Challenging data sets [16].

To measure the performance of an algorithm, we use the protocol provided by [1]. More specifically, the difference between the estimated transformation matrix $\hat{\mathbf{T}}$ and the ground-truth transformation matrix $\mathbf{T}$ is computed as $\Delta\mathbf{T}=\hat{\mathbf{T}}\cdot\mathbf{T}^{-1}$. The translation error, also known as the Relative Pose Error (RPE), and the rotation error are then computed as $e_{trans}=\sqrt{\Delta x^2+\Delta y^2+\Delta z^2}$ and $e_{rot}=\arccos\left(\frac{\mathrm{trace}(\Delta\mathbf{T})}{2}-1\right)$, where $(\Delta x,\Delta y,\Delta z)$ are the translation components in the last column of $\Delta\mathbf{T}$.
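The error protocol can be sketched as follows (our Python transcription of the formulas above, with a clamp on the arccos argument added for numerical safety):

```python
import numpy as np

def registration_errors(T_est, T_gt):
    """RPE translation and rotation error of Delta T = T_est @ inv(T_gt)."""
    dT = T_est @ np.linalg.inv(T_gt)
    e_trans = np.linalg.norm(dT[:3, 3])            # sqrt(dx^2 + dy^2 + dz^2)
    # For a 4x4 rigid transform, trace(dT) = 2 + 2*cos(theta).
    e_rot = np.arccos(np.clip(dT.trace() / 2.0 - 1.0, -1.0, 1.0))
    return e_trans, e_rot
```

The clamp matters in practice: accumulated floating-point error can push the argument marginally outside $[-1,1]$ for near-identity rotations.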

The proposed algorithm was implemented in MATLAB (v2019b) on a PC with an Intel Core i5-8300H @ 2.30GHz and 8 GB RAM. For a fair comparison, we use the NDT implementation in MATLAB. Since GICP is closely related to our algorithm, we use our own implementation with reference to the version in PCL. We remark that GICP shares the same surface statistics (covariance matrix, information matrix and surface normal) with CoBigICP. For the other methods, we use the implementations provided by the authors. In all experiments, as specified in the protocol of the data set, point clouds are down-sampled to 10%. For the bandwidth parameter $\sigma$, we adopt the result in [12] with a decay rate of 0.97.

IV-A Consecutive Registration and Ablation Study

In this section, we run all methods on a series of consecutive scan pairs from two data sets to evaluate their performance in both structured and semi-structured environments:

  • ETH Hauptgebaude collects lidar frames along a corridor with repetitive pillars and arches.

  • Gazebo Summer is a semi-structured environment with a mixture of man-made structures, vegetation and dynamic objects such as walking people.

We compare CoBigICP with GICP, NDT and MiNoM [6] by computing the RPE for each consecutive pair. The main results are shown in Tables I and II. Note that CoGICP serves as an ablation: it discards the bidirectional correspondence and involves only the correntropy metric. Compared to the other methods, CoBigICP is the most competitive, with the best accuracy and favorable time consumption. On the ETH Hauptgebaude data set, CoBigICP far exceeds the other algorithms; this supports the importance of the correspondence step and validates the bidirectional correspondence proposed in this paper. The Gazebo Summer result shows that CoBigICP is also capable of handling semi-structured environments. The ablation between CoGICP and CoBigICP shows the effectiveness of the bidirectional correspondence in both accuracy and speed. An illustrative reconstruction result is shown in Fig. 1: we compute the pose estimates using CoBigICP and overlay the transformed point clouds. The reconstruction result (colored points) is consistent with the ground truth (white points).

IV-B Convergence basin under perturbations

In this section, 6720 scan pairs from ETH Hauptgebaude (not necessarily consecutive) with different initial pose perturbations provided by Pomerleau et al. [1] are used to evaluate the robustness, i.e., the convergence basin, of the algorithms. The difficulty levels of the perturbations are denoted easyPose, mediumPose and hardPose. These perturbations make the registration challenging and meaningful. For a fair comparison, the results of d2dndt, p2dndt and trimmed-pt2pl [7] are those provided by the authors, downloaded directly from the website. The results in Fig. 4 show that CoBigICP has the widest translation convergence basin while remaining competitive under rotation perturbations. This is consistent with our analysis in Section III-B.
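As a hypothetical illustration of how such an initial perturbation is applied (not the protocol's actual perturbation sampler; for simplicity the rotation is about the vertical axis only), a perturbed initial guess can be formed by left-composing the pose with a small rigid transform:

```python
import numpy as np

def perturb_pose(T, rot_deg, trans_m):
    """Left-compose a 4x4 pose with a rotation of rot_deg about the
    z-axis and a translation of trans_m along the x-axis, mimicking
    an easy/medium/hard initial perturbation."""
    a = np.radians(rot_deg)
    c, s = np.cos(a), np.sin(a)
    P = np.eye(4)
    P[:3, :3] = np.array([[c, -s, 0.0],
                          [s,  c, 0.0],
                          [0.0, 0.0, 1.0]])
    P[0, 3] = trans_m
    return P @ T
```

Larger rot_deg and trans_m place the initial guess further from the basin of attraction, which is exactly what separates the easyPose, mediumPose and hardPose settings.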

TABLE I: Results for ETH Hauptgebaude

Methods            GICP    NDT     MiNoM   CoGICPᵃ   CoBigICP
e_trans [m] ↓      0.03    0.16    0.46    0.02      0.005
e_rot [deg] ↓      0.19    9.08    0.32    0.19      0.15
time [s] ↓         2.10    2.94    2.09    1.55      0.64

ᵃ For ablation study.
TABLE II: Results for Gazebo Summer

Methods            GICP    NDT     MiNoM   CoGICPᵃ   CoBigICP
e_trans [m] ↓      0.10    0.16    0.36    0.05      0.06
e_rot [deg] ↓      1.76    9.86    3.19    0.34      0.21
time [s] ↓         2.43    2.32    2.00    1.08      0.87

ᵃ For ablation study.

V Conclusions

In this paper, we present a novel point cloud registration algorithm based on bidirectional correspondence and correntropy. Extensive comparative experiments show that our method reaches state-of-the-art precision and offers better robustness even with poor initial values.

In the future, we plan to study the adaptive estimation of the outlier ratio from the robot's travelling distance between consecutive frames and the predictable number of dynamic objects in the robot's working environment, and ultimately to remove the trade-off between parameter selection and performance.

References

  • [1] F. Pomerleau, F. Colas, R. Siegwart, and S. Magnenat, “Comparing ICP variants on real-world data sets,” Auton. Robots, vol. 34, no. 3, pp. 133–148, Apr. 2013. [Online]. Available: https://doi.org/10.1007/s10514-013-9327-2
  • [2] A. Segal, D. Hähnel, and S. Thrun, “Generalized-ICP,” in Proc. of Robotics: Science and Systems, Jun. 2009.
  • [3] J. Serafin and G. Grisetti, “NICP: Dense normal based point cloud registration,” in 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Sep. 2015, pp. 742–749.
  • [4] W. Tabib, C. O’Meadhra, and N. Michael, “On-manifold gmm registration,” IEEE Robotics and Automation Letters, vol. 3, no. 4, pp. 3805–3812, Oct 2018.
  • [5] S. A. Parkison, L. Gan, M. G. Jadidi, and R. M. Eustice, “Semantic iterative closest point through expectation-maximization.” in BMVC, 2018, p. 280.
  • [6] D. Wang, J. Xue, Z. Tao, Y. Zhong, D. Cui, S. Du, and N. Zheng, “Accurate mix-norm-based scan matching,” in 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Oct 2018, pp. 1665–1671.
  • [7] D. Chetverikov, D. Svirko, D. Stepanov, and P. Krsek, “The trimmed iterative closest point algorithm,” in Object recognition supported by user interaction for service robots, vol. 3, Aug 2002, pp. 545–548 vol.3.
  • [8] A. Zaganidis, L. Sun, T. Duckett, and G. Cielniak, “Integrating deep semantic segmentation into 3-d point cloud registration,” IEEE Robotics and Automation Letters, vol. 3, no. 4, pp. 2942–2949, 2018.
  • [9] P. Biber and W. Strasser, “The normal distributions transform: a new approach to laser scan matching,” in Proceedings 2003 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2003) (Cat. No.03CH37453), vol. 3, Oct 2003, pp. 2743–2748 vol.3.
  • [10] P. J. Besl and N. D. McKay, “A method for registration of 3-d shapes,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 14, no. 2, pp. 239–256, Feb 1992.
  • [11] M. Magnusson, “The three-dimensional normal-distributions transform: an efficient representation for registration, surface analysis, and loop detection,” Ph.D. dissertation, Örebro universitet, 2009.
  • [12] E. Hasanbelliu, L. S. Giraldo, and J. C. Príncipe, “Information theoretic shape matching,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 36, no. 12, pp. 2436–2451, Dec 2014.
  • [13] S. Du, G. Xu, S. Zhang, X. Zhang, Y. Gao, and B. Chen, “Robust rigid registration algorithm based on pointwise correspondence and correntropy,” Pattern Recognition Letters, 2018.
  • [14] P.-A. Absil, C. G. Baker, and K. A. Gallivan, “Trust-region methods on riemannian manifolds,” Foundations of Computational Mathematics, vol. 7, no. 3, pp. 303–330, 2007.
  • [15] C. Forster, L. Carlone, F. Dellaert, and D. Scaramuzza, “On-manifold preintegration for real-time visual–inertial odometry,” Trans. Rob., vol. 33, no. 1, p. 1–21, Feb. 2017. [Online]. Available: https://doi.org/10.1109/TRO.2016.2597321
  • [16] F. Pomerleau, M. Liu, F. Colas, and R. Siegwart, “Challenging data sets for point cloud registration algorithms,” The International Journal of Robotics Research, vol. 31, no. 14, pp. 1705–1711, Dec. 2012.