LIO-GVM: an Accurate, Tightly-Coupled Lidar-Inertial Odometry with Gaussian Voxel Map
Abstract
This letter presents a probabilistic voxel-based LiDAR Inertial Odometry framework for accurate and robust pose estimation. The framework addresses the correspondence mismatching issue by representing the LiDAR points as a set of Gaussian distributions and evaluating the divergence in variance for outlier rejection. Based on the fitted distributions, a new residual metric is proposed for the filter-based Lidar inertial odometry by incorporating both the distance and variance disparities, further enriching the comprehensiveness and accuracy of the residual metric. With the strategic design of the residual, we propose a simple yet effective voxel-solely mapping scheme, which only requires the maintenance of one centroid and one covariance matrix for each voxel. Experiments on different datasets demonstrate the robustness and high accuracy of our framework for various data inputs and environments. To the benefit of the robotics society, we open-source the code at https://github.com/Ji1Xingyu/lio_gvm.
Index Terms:
LiDAR inertial odometry, SLAM, probabilistic feature association, voxel mapI Introduction
The vital requirement for an autonomous robot is to estimate its ego-motion and build a map of the environment without prior knowledge, which is widely known as Simultaneous Localization and Mapping (SLAM). In particular, the LiDAR and IMU sensors are commonly employed together [1, 2, 3, 4, 5, 6] in SLAM problem to form a LiDAR Inertial Odometry (LIO) because of their complementary ability: LiDAR contributes accurate range measurements for the environment, while IMU provides precise and surrounding-insensitive short-term motion information [7]. Because of the temporal and computational efficiency, filter-based LIO [3, 4, 5, 6] has been investigated in the recent few years. The performance of a LiDAR odometry is predominantly influenced by two components: point cloud registration and mapping scheme [8]. In this letter, we aim to enhance the performance of a filter-based LIO system by addressing improvements to the shortcomings of the existing works in these two areas.

Point cloud registration refers to aligning the source LiDAR scan with a target map together, namely, to estimate the optimal transformation between their local coordinates. Early works extract the geometric feature points from the input scan and match them with the nearest point in the target map [9, 10, 11]. A limitation of these methods is their deterministic target map representation, which cannot account for measurement uncertainties, resulting in inaccuracies in mapping and pose estimation. To address this issue, subsequent works model the target map in a probabilistic way, such as planes [12] or surfels [13, 14, 15]. The matched correspondences with large uncertainty will be depressed in the pose estimation phase. Nevertheless, they still employ the nearest correspondence matching. The key limitation of this matching scheme is its inability to reject false positive correspondences, which occurs when the source point is matched with the wrong target feature but identified as a valid pair. This issue can negatively affect the pose estimation accuracy.
The mapping scheme acts as the interdependent component with point cloud registration, which maintains the target map for efficient environment representation. A common method is to maintain the target map with a k-d tree [9, 10, 11] as it can provide strict K nearest neighbor (k-NN) search for the correspondence matching. The time complexity for the deletion, insertion, and query operations of the k-d tree is , which can impact real-time performance when dealing with a significantly large target map. To address this problem, some systems opt to govern the target map with hash-based voxels [5, 12, 16], with the time complexity for element operation. However, due to the requirement of the front end, these voxel-based maps need to embed a supplemental data structure (k-d tree, Hilbert Curve, etc.) within voxels, which cannot reach the optimal operation speed.
To cope with the problem of mismatch in point cloud registration, we represent the points in both the source and target clouds as Gaussian distributions [17, 18, 19] and evaluate the matched correspondences by the disparity in variance. Moreover, based on the distribution assumption, we propose a new residual metric for filter-based LIO. The formula accesses both the distance and variance disparities, further enriching the comprehensiveness and accuracy of the residual metric. For the mapping side, instead of storing all the points, we propose an incremental voxel map without any additional structures, which only maintains a centroid and covariance matrix for each voxel and achieves time complexity.
In conclusion, we implement the proposed point cloud registration and mapping scheme to a filter-based LIO system denoted as LIO-GVM. The contributions can be summarized as follows:
-
•
We propose a new correspondence matching method and residual metric for filter-based LIO. This matching scheme can reject false positive correspondences. Additionally, the metric measures not only distance but also variance disparity, enhancing the comprehensiveness and accuracy of the residual formulation.
-
•
We propose a simple yet efficient mapping scheme. The map only necessitates the maintenance of the centroid and covariance matrix for each voxel, which can achieve time complexity for element operation.
-
•
We implement the two key techniques into a filter-based LIO system and evaluate its robustness and efficiency through various experiments (see Fig. 1). To the benefit of the robotics community, our implementation code is open-sourced at https://github.com/Ji1Xingyu/lio_gvm.
II Related Work
For point cloud registration, LOAM [9] first proposes to extract planar and edge features from the source and target clouds for fast registration and achieves accurate results. LeGO-LOAM [10] performs Euclidean cluster first to extract the ground plane and restrict the correspondence match in the same cluster. Instead of extracting the geometric features from the source scan, some systems directly register the raw source cloud into the target map [4, 3]. SuMa [13] and SLICT [14] render points in the target map into surfels and consider the point-to-surfel distance as residual. Similarly, VoxelMap [12] proposes to maintain a probabilistic plane-based target map and register the point in the source scan to the map. These point-to-surfel (or plane) residuals consider the noise in the target map and thus can depress the influence of matched pairs with high uncertainty in the pose estimation phase. Nevertheless, they cannot handle the false positive correspondence problem. On the contrary, our proposed matching scheme can evaluate the similarity of the matched distributions and reject the false positive pairs.
The proposed residual metric is a weighted distribution-to-distribution (D2D) distance and is similar to the D2D residual in GICP [18] and D2D-NDT [20]. The key distinction lies in the proposed similarity-based weighting scheme, which can adjust the weight of the residual for matched correspondences based on their divergence in covariance. Subsequently, the proposed residual metric can provide a more comprehensive and accurate pose estimation. Similarly, LiTAMIN2 [21] introduces symmetric KL-Divergence to the residual metric for divergence evaluation. However, their proposed formulation cannot be directly deployed in filter-based pose estimation.
For mapping schemes, a common method is to maintain the target map with a k-d tree [9, 11]. The conventional k-d tree needs to rebuild the entire tree for the map update, which is inefficient and time-consuming. FAST-LIO2 [4] uses the ikd-Tree [22], a dynamic k-d tree that can incrementally update the map. VoxelMap, Faster-LIO [5], and SuperOdom [16] govern the target map with hash-based voxels. However, due to the requirement of the front end, these voxel-based mapping schemes need to embed a supplemental data structure within voxels to form a hierarchical structure for the maintenance of the raw points. Instead, the proposed mapping scheme only requires the maintenance of the centroid and covariance matrix for each voxel, resulting in a time complexity of for element operation.
III Preliminary
Notation | Explanation |
---|---|
The Global, IMU, and LiDAR frame | |
The end time of the LiDAR scan | |
The sample time of IMU input, N.B. | |
The transformation matrix of w.r.t. | |
The rotation matrix and translation vector of w.r.t. | |
The point coordinate with reference to at time | |
The coordinate of the voxel with reference to | |
The centroid vector and covariance matrix of voxel at | |
The skew-symmetric matrix and transpose of vector |
III-A Notation
The important notations used in this letter are listed in Tab. I. Besides, we assume a static calibrated extrinsic matrix from IMU frame to LiDAR frame.
III-B System Description
The proposed LIO-GVM is modeled as a discrete-time dynamical system sampled at the IMU rate. Its state is defined on manifold as:
(1) |
where and are the velocity of IMU and gravity vector with reference to the global frame, and are the bias vectors of gyroscope and accelerometer of the IMU. The true state of the system can be divided into two states:
(2) | ||||
where is the error state, is the nominal state, and are encapsulated operators that represent a bijective mapping between the manifold and its local tangent space [23]; is the minimal representation of the error rotation. In this way, we can use the iterative error state Kalman filter (IESKF) to estimate the error state on Euclidean space and then project it back to the manifold using (2) for the maintenance of the true state .
IV Methodology

Given the estimate of the last scan: and , we will detail the system workflow in the following subsections.
IV-A IMU Process
Recalling the state defined in (1), we have the following discrete model based on continuous kinematics equations [23]:
(3) |
where , , and are the system input, system noise, and kinematics function respectively. Please refer to Appendix -A for the detailed formulations. Firstly, the mean of the nominal state will be predicted without considering the noise :
(4) |
Consequently, this will result in the accumulation of inaccuracies. To cope with the problem, is taken into account in the prediction of the error state :
(5) |
where and are the Jacobian matrices of w.r.t. and [4]. In the implementation, the system noise is considered as Gaussian white noise following the normal distribution , where is the covariance matrix. The error state is an independent random vector with multivariate normal distribution: . As a consequence, the predicted covariance matrix of can be calculated as the linear transformation:
(6) |
The IMU process module utilizes the state estimate from the previous scan as the initial condition and recursively predicts , and till the end time of the current LiDAR scan .
IV-B LiDAR Process
All the LiDAR points with timestamps are cached to form a point cloud . For most commercial LiDAR, the points are sampled under different timestamps, which will introduce motion distortion to the cloud . We seek to compensate for the distortion by projecting all the points in to , the LiDAR frame at . To this end, we adopt the backward propagation proposed in [3] to obtain the relative motion between IMU frames sampled at and : . The point can thus be projected to as:
(7) |
Moreover, a representative descriptor for the point is required for robust correspondence matching. Thus, we represent the deskewd points as a collection of Gaussian distributions. For each point , we collect its nearest neighbors and fit them to a Gaussian distribution [17]. Eventually, the LiDAR process module outputs a collection of the fitted Gaussian distributions for all the points: .
IV-C Iterated State Estimation
Upon completion of the IMU and LiDAR processes, the following steps for the IESKF-based LIO system involve correspondence matching, residual formulation, and Kalman update. However, existing filter-based LIO systems [3, 6, 5] typically adopt the nearest correspondence matching strategy and treat the ensuing point-to-feature distance as the residual. This strategy involves projecting to the global frame as:
(8) |
For long-term odometry, the inevitable accumulation of localization drifts introduces errors to . Simultaneously, the presence of spike noise, a common issue in low-cost IMUs, leads to inaccuracies in . These factors prevent from being projected to its exact corresponding feature, generating mismatches (see Fig. 3). Besides, matching the nearest feature in the map could be computationally intensive if the map size grows too large in long-term odometry. To cope with these issues, we propose a new residual formulation for IESKF, which can mitigate incorrectly matched correspondences while maintaining computational efficiency (see Fig. 3). The details are described in the following subsections.

IV-C1 Correspondence Matching

Suppose the iterated state estimation module is under the iteration and the corresponding state estimation is . We first project the fitted distributions into the global frame as:
(9) | ||||
For each projected distribution , we aim to find its near and similar correspondence in the global voxel map. Each voxel contains one Gaussian distribution (details of the global map is presented in Section IV-D). Hellinger distance is a statistical measure that quantifies the divergence between two probability distributions. Its value is dominated by two terms: a similarity term and a distance term. The similarity term captures the resemblance between two probability distributions, while the distance term quantifies the separation. For the correspondence evaluation, we extract the similarity term from the squared Hellinger distance [24] and denote it as the similarity metric :
(10) |
The larger the is, the more similar the two distributions are. The correspondence matching procedure can now be conceptualized as follows: We first find the voxel coordinate in the global map in which occupies. Then, we define a neighbor set of the occupied voxel as (the selection for is illustrated in Fig. 4). After that, we calculate the similarity between the source distribution and the target distribution in () and discard the pairs with smaller than a threshold . Empirically, the selection for is influenced by the noise level of the LiDAR and IMU: the larger the noise is, the smaller the should be selected. Generally, should not be selected in a strict manner, as the noise inherent in the LiDAR sampling and deskewing procedures could potentially have an adverse impact on similarity.
IV-C2 Residual Formulation
Before presenting the new residual metric, it is necessary to explain the observation model and state estimation scheme for IESKF. This will provide the rationale behind the proposed residual metric. The observation function of the system is:
(11) |
where is the measurement noise satisfying , is the observation function that maps the state space to the observation space. Since the error state is quite small, we can linearize the observation at :
(12) |
where is the Jacobian matrix of w.r.t. . As and lie in different tangent space of the manifold (the tangent space of and respectively), we need to project to in the tangent space as:
(13) |
where is the inverse of the Jacobian matrix of w.r.t. . Readers may refer to Appendix -A for the detailed formulation of . Combining (12) and (13), following the Bayesian rule, we have the maximum a posterior (MAP) problem:
(14) |
Given that the matched correspondences are Gaussian distribution pairs, we aim to design a residual metric such that the second term in (14) can represent both the distance and the variance disparity. Let’s consider the weighted Mahalanobis distance presented as follows:
(15) |
where is the number of matched correspondences, () is a constant diagonal matrix to avoid the singularity issue. The Mahalanobis distance is weighted by the similarity of the matched correspondences, ensuring that the pairs with higher similarity and closer distance contribute more to the estimation process, which aligns with our objective. Adopting this formula, the resulting residual metric for would be:
(16) |
Appendix -B exhibits the detailed derivation and definition.
IV-C3 State Update
IV-D Incremental Global Map
The global map is composed of voxels stored as a hash table: , which builds a map from the voxel coordinate to its centroid and the covariance matrix . This structure demonstrates significant efficiency in querying, achieving an optimal time complexity for correspondence matching. This structure is similar to that of VGICP [19]. In contrast to VGICP, which only supports pair-wise registration and requires repeated voxelization of the source scan, our approach offers a complete solution for insert, update, and delete operations within a global target map. We also address the variable LiDAR sampling rate problem (e.g. the ground voxels are undersampled in the distance and degrade from the ”plane” to ”line” feature) by introducing an update scheme to ensure the continuous and accurate maintenance of the map.
For the initialization, the first scan is regarded as the global frame . Each point in the scan is subsequently fitted to a Gaussian distribution as the procedure in IV-B to obtain . Consequently, the scan is voxelized with voxel size . Suppose we have the voxel with coordinate encompassing a set of points , the centroid and covariance matrix will be calculated as the average of the distributions within. Once the current estimate is acquired, the current scan contributes to the global map. Since the projected Gaussian distributions have been obtained during the state estimation phase, we proceed to voxelize the current scan following (18), resulting in a temporary voxel map . For each voxel present in this temporary voxel map, we query its coordinate in the global map. If the global map doesn’t contain the voxel, we directly insert the voxel into the global map. This operation is time-efficient since the insert operation for a hash table has a time complexity of . Otherwise, the global map will be updated as:
(18) | ||||
where is the recorded point number of voxel in the global map, is the corresponding one in the temporary voxel map. In this way, the global map only necessitates the maintenance of the centroid and covariance matrix for each voxel, which is considerably efficient in terms of memory usage and time efficiency compared to maintaining a tree structure storing all points [4] or embedding additional structures into the voxels [5, 16]. Specifically, is updated using the formula . This is motivated by the goal of maintaining a continuously updated voxel scheme. This approach ensures that is constrained by the maximum number of points that can be sampled by LiDAR within a voxel. Otherwise, recalling (18), an unbounded would lead to after a few scans, thus preventing the update of voxel parameters. Consequently, this enables an incremental update of the global map without the necessity to delete voxels from the map. Nevertheless, the mapping scheme also provides the option to remove voxels out of the effective range of the LiDAR for operation in exceptionally large environments.
V Experiment
In this section, the proposed system will be evaluated in terms of accuracy, temporal efficiency, and storage efficiency. All the experiments are conducted on a laptop with a 2.3 GHz AMD Ryzen 7 6800H CPU and 16 GB RAM. We chose 9 sequences from two representative datasets for the evaluation. The first 5 sequences, denoted as nc_*, are selected from Newer College dataset [26], with an OS1-64 LiDAR at a scan rate of 10Hz and a built-in 6-axis IMU sampled at 100Hz. The rest sequences are selected from the MCD VIRAL dataset [27], collected by an OS1-128 LiDAR sampled at 10Hz, a Livox Mid-70 LiDAR, and a VectorNav VN100 9-axis IMU sampled at 800 Hz. In order to evaluate the influence of IMU on the odometry accuracy, we perform experiments on different IMUs for the same sequences from MCD VIRAL dataset: ntu_* denoting OS1-128 LiDAR with VN100 IMU, ntuo_* denoting OS1-128 LiDAR with its built-in IMU. The ground truth poses for both datasets are generated by registering each LiDAR scan to a highly accurate prior map using an ICP method.
ntu_02 | ntu_04 | ntu_10 | ntu_13 | ntuo_02 | ntuo_04 | ntuo_10 | ntuo_13 | nc_01 | nc_02 | nc_05 | nc_06 | nc_07 | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
DLO | 6.46/0.43 | 3.65/0.11 | 7.65/0.51 | 3.98/0.20 | 6.46/0.43 | 3.75/0.11 | 7.65/0.51 | 3.98/0.20 | 4.20/0.51 | 4.27/0.37 | 3.44/0.44 | 1.31/0.62 | 1.73/0.18 |
VoxelMap | 1.83/0.11 | 2.98/0.18 | 8.48/0.57 | 1.83/0.11 | 2.98/0.18 | 8.48/0.57 | |||||||
LIO-SAM | 7.80/0.52 | 3.71/0.23 | 9.32/0.63 | 3.05/0.16 | -a | - | - | - | - | - | - | - | - |
FAST-LIO2 | 2.09/0.24 | 2.11/0.13 | 2.34/0.15 | 2.89/0.11 | 5.64/0.80 | 3.51/0.23 | 7.32/0.41 | 6.79/0.50 | 3.23/0.37 | 4.15/0.43 | 3.13/0.26 | 0.94/0.37 | 1.16/0.15 |
LIO-GVM(w/o s) | 2.37/0.40 | 2.35/0.11 | 3.21/0.14 | 2.79/0.14 | 5.51/0.68 | 4.73/0.39 | 6.59/0.39 | 4.58/0.26 | 3.36/0.31 | 4.76/0.44 | 3.46/0.47 | 1.28/0.43 | 1.38/0.20 |
LIO-GVM(plane) | 8.24/0.68 | 6.91/0.41 | 8.29/0.74 | 5.77/0.43 | 9.34/0.90 | 7.95/0.54 | 10.2/0.81 | 8.89/0.73 | 6.29/0.49 | 6.30/0.67 | 4.54/0.72 | 2.48/0.59 | 3.45/0.30 |
LIO-GVM | 1.74/0.46 | 2.26/0.05 | 3.16/0.11 | 2.74/0.09 | 3.23/0.40 | 2.69/0.14 | 4.25/0.27 | 3.60/0.18 | 2.47/0.26 | 3.20/0.35 | 3.27/0.33 | 0.89/0.33 | 1.10/0.17 |
-
•
Errors are denoted as ATE[%]/ARE[deg/10m] (smaller is better). We only align the position and orientation at the origin for a comprehensive evaluation.
-
•
Bold value stands for the best; underline is the second best; ’-’ denotes that the algorithm is not applicable; ’’ indicates that the algorithm diverges.
-
•
’-a’: LIO-SAM is not applicable for ntuo_* an nc_* sequences due to the lack of a 9-axis IMU.
V-A Evaluation of Odometry Accuracy
We evaluate the accuracy by comparing the KITTI metric [28]: the average translation error (ATE), and the average rotation error (ARE). We compare LIO-GVM with three state-of-the-art LiDAR (-inertial) odometry algorithms: DLO [29], VoxelMap [12], LIO-SAM [1], and FAST-LIO2 [4]. DLO is a GICP-based LO, VoxelMap is a probabilistic voxel-based LO while LIO-SAM and FAST-LIO2 are LIO systems. Parameters for all methods are kept constant across sequences for consistency. For LIO-GVM, the voxel size, , and are empirically set as 1.0 , 0.70 and , respectively. Additionally, the neighbor searching scheme is set as 7-neighbors searching.


Two variations of the proposed system denoted as LIO-GVM(w/o s) and LIO-GVM(plane) are tested for ablation study. LIO-GVM(w/o s) is implemented without the outlier rejecting scheme proposed in Sec. IV-C1. LIO-GVM(plane) replaces the proposed new residual metric with the point-to-plane distance [4]. As shown in Tab. II, both LIO-GVM(w/o s) and LIO-GVM(plane) turn worse compared to the original system for all the sequences, indicating that the proposed correspondence matching scheme and residual metric both contribute to the accuracy of the system. The accuracy of LIO-GVM(plane) tends to be much worse than LIO-GVM(w/o s), which is because the residual metric inherently reduces the weights for mismatches. The drastic accuracy degradation of LIO-GVM(plane) is because, without the proposed residual metric, all the adopted correspondences contribute equally to the estimation, even those with low similarity.
As illustrated in Tab. II, LIO-GVM achieves the best results for most sequences due to its ability to reject mismatched distributions (the proposed correspondence matching) and evaluate the variance divergence in the distribution covariance (the proposed residual metric). For ntu_* sequences, FAST-LIO2 achieves comparable results with LIO-GVM, because the accurate IMU integration can approximately project source points to the correct feature in the target map. VoxelMap offers accurate pose estimation in some sequences but frequently diverges, especially for the nc_* sequences equipped with a 64-line LiDAR. This divergence is attributed to its map update scheme. When the LiDAR remains stationary for several scans, distant voxels converge prematurely and stay undersampled. This issue is particularly obvious for ground voxels, leading to the majority of divergence occurrences along the z-axis. Conversely, LIO-GVM employs a continuously updated mapping scheme, ensuring that the performance remains stable for motion and LiDAR sampling rate variations.
The performance of all LIO systems is negatively impacted in ntuo_* compared to ntu_* sequences due to the equipment of a low-cost IMU. Nevertheless, LIO-GVM is able to mitigate the impact of IMU noises through its proposed correspondence matching scheme, which can reject outliers. Additionally, LIO-GVM incorporates a new residual metric that takes similarity into account during the estimation phase, allowing more effective correspondences to have a greater influence on the estimation. Consequently, LIO-GVM exhibits the best accuracy compared to other systems. However, all the methods have a common limitation in accommodating the pronounced changes in elevation in the ntu_* sequences, as seen in Fig. 5 and Fig. 6.
Overall, LIO-GVM has demonstrated the best performance regarding odometry accuracy, showing the effectiveness of the error-rejecting correspondence matching scheme and the distribution-based residual metric in achieving accurate pose estimation across various sensor setups, especially in long-term applications or with cheaper IMUs.
Gaussian Fitting | Correspondence Match | Incremental Update | Total Map Time | |||||
---|---|---|---|---|---|---|---|---|
ikd-Tree | GVM | ikd-Tree | GVM | ikd-Tree | GVM | ikd-Tree | GVM | |
ntu_02 | 0 | 11.61 | 25.05 | 1.29 | 2.45 | 5.13 | 27.50 | 18.03 |
ntu_04 | 0 | 12.91 | 24.26 | 1.41 | 3.96 | 5.34 | 28.22 | 19.66 |
ntu_10 | 0 | 11.86 | 24.27 | 1.34 | 3.97 | 5.87 | 28.24 | 19.08 |
ntu_13 | 0 | 12.71 | 24.89 | 1.67 | 3.62 | 5.18 | 28.51 | 19.56 |
nc_01 | 0 | 8.10 | 14.36 | 0.69 | 0.78 | 2.97 | 15.14 | 11.76 |
nc_02 | 0 | 8.25 | 15.76 | 0.73 | 0.92 | 2.84 | 16.68 | 11.82 |
nc_05 | 0 | 7.73 | 12.62 | 0.67 | 0.57 | 2.84 | 13.19 | 11.24 |
nc_06 | 0 | 7.19 | 13.66 | 0.70 | 0.81 | 3.23 | 14.47 | 11.12 |
nc_07 | 0 | 8.99 | 15.67 | 0.62 | 1.11 | 3.18 | 16.78 | 12.79 |
-
•
We evaluate the average time consumed per scan [ms].
-
•
Bold value stands for the best value.


V-B Evaluation of Storage Efficiency
In this section, we evaluate the storage consumption of the global map generated by LIO-GVM. Specifically, in LIO-GVM, we have defined a custom point type using the Point Cloud Library (PCL). Each point encapsulates all the elements of , comprising a total of four integer values and nine floating-point values. All the voxels are stored in a compressed binary .pcd file. For other systems, we configure their local map size to be 500m, ensuring coverage of all areas within the two datasets, and save all the points in the map to a compressed binary .pcd file.

As illustrated in Tab. IV, the map storage consumption of LIO-GVM outperforms other systems for all the sequences. This is because other systems need to ensure that the downsampling filter of the map is not excessively large (usually 0.5m) to preserve the geometric information of the points. In contrast, our approach allows us to allocate a considerably larger voxel size (commonly 1m) compared to other systems while still offering a comprehensive representation of the intra-voxel point distribution.
ntu_02 | ntu_04 | ntu_10 | ntu_13 | nc_01 | nc_02 | nc_05 | nc_06 | nc_07 | |
---|---|---|---|---|---|---|---|---|---|
FAST-LIO2 | 6.07 | 21.83 | 19.13 | 22.19 | 7.93 | 9.24 | 1.15 | 0.65 | 5.62 |
LIO-GVM | 3.55 | 12.61 | 11.39 | 8.29 | 4.25 | 4.60 | 0.13 | 0.48 | 3.01 |
-
•
The map storage is analyzed by evaluating the storage consumption [MB]. All the maps are saved as binary compressed PCD files.
-
•
Bold value stands for the best value.
V-C Evaluation of Time Consumption
In this section, we will evaluate the temporal efficiency of LIO-GVM, utilizing the same parameters as those deployed in our previous odometry evaluations for consistency. We first analyze the influence of the data structure on the mapping procedure, which involves correspondence matching and incremental updates. Given the superior performance of the ikd-Tree provided in [4] compared to other dynamic tree-based structures, our comparison will be limited to this structure. While the Gaussian fitting process does not directly participate in the mapping procedure, its time consumption is still accounted for in LIO-GVM for a fair comparison. We denote the data structure in LIO-GVM for the mapping procedure as GVM. The results are illustrated in Tab. III. GVM significantly outperforms ikd-Tree in correspondence matching because the time complexity of -nearest neighbors searching for ikd-Tree is , where is the tree size; but for GVM, managed in a hash table, the corresponding time complexity is . ikd-Tree slightly performs better in the incremental update. Nonetheless, GVM demonstrates superior performance in total map time.
In Fig. 7, we illustrate the average total time consumed per scan for all the sequences. The total time we evaluated includes all the procedures: data processing, pose estimation, and map update. Since LIO-SAM and DLO run map update in different thread with the pose estimation thread, we only evaluate their time consumed for state estimation. Notably, VoxelMap consumes an average of 141.3 ms per scan due to its demand for dense point cloud data. Even though the mapping procedure of LIO-GVM is faster than FAST-LIO2, FAST-LIO2 achieves the shortest total time for almost all the sequences. This is because LIO-GVM requires more geometric information from the input scan. For example, in ntu_* sequences, FAST-LIO2 employs fewer feature points than LIO-GVM (4638 versus 5741). Consequently, LIO-GVM spends more time on pose estimation. However, as illustrated in Tab. 8, the fast performance of FAST-LIO2 and DLO comes with the trade-off of increased runtime RAM usage, while LIO-GVM can achieve comparable temporal efficiency with lower RAM usage.
VI Conclusion
In conclusion, this letter proposes an accurate filter-based LIO system denoted as LIO-GVM, which models the points as Gaussian distributions for robust correspondence matching. The key innovation of the system is the proposed new residual metric. Unlike previous methods that solely measure distance, our approach incorporates variance disparity, thereby enhancing the system’s performance. Benefiting from the new metric, the map of LIO-GVM can be maintained in a simple yet efficient manner. We demonstrate the reliability and efficiency of LIO-GVM through extensive experiments on different datasets. The limitation of LIO-GVM is that it requires equipment with higher resolution LiDAR to ensure informative Gaussian fitting of the input scan. Additionally, the voxel size is not adaptive to the environment. To address these limitations, future work will focus on developing an adaptive voxel representation scheme for the global map.
-A Supplemental Material
Please find details in the supplemental material: supplemental_material.pdf.
-B Derivation of the Proposed Residual Metric
Let’s denote as , then equate (15) with the second term in MAP problem, we’ll have:
(19) |
To avoid the singularity issue, we first perform eigendecomposition on to have . Subsequently, the eigenvalues are normalized and clamped at : . In order to obtain the same formula as the left-hand side, we perform matrix decomposition to have , which is quite easy since is a diagonal matrix. Thus,
(20) | ||||
where:
(21) | ||||
References
- [1] T. Shan, B. Englot, D. Meyers, W. Wang, C. Ratti, and D. Rus, “Lio-sam: Tightly-coupled lidar inertial odometry via smoothing and mapping,” in 2020 IEEE/RSJ international conference on intelligent robots and systems (IROS). IEEE, 2020, pp. 5135–5142.
- [2] H. Ye, Y. Chen, and M. Liu, “Tightly coupled 3d lidar inertial odometry and mapping,” in 2019 International Conference on Robotics and Automation (ICRA). IEEE, 2019, pp. 3144–3150.
- [3] W. Xu and F. Zhang, “Fast-lio: A fast, robust lidar-inertial odometry package by tightly-coupled iterated kalman filter,” IEEE Robotics and Automation Letters, vol. 6, no. 2, pp. 3317–3324, 2021.
- [4] W. Xu, Y. Cai, D. He, J. Lin, and F. Zhang, “Fast-lio2: Fast direct lidar-inertial odometry,” IEEE Transactions on Robotics, vol. 38, no. 4, pp. 2053–2073, 2022.
- [5] C. Bai, T. Xiao, Y. Chen, H. Wang, F. Zhang, and X. Gao, “Faster-lio: Lightweight tightly coupled lidar-inertial odometry using parallel sparse incremental voxels,” IEEE Robotics and Automation Letters, vol. 7, no. 2, pp. 4861–4868, 2022.
- [6] C. Qin, H. Ye, C. E. Pranata, J. Han, S. Zhang, and M. Liu, “Lins: A lidar-inertial state estimator for robust and efficient navigation,” in 2020 IEEE international conference on robotics and automation (ICRA). IEEE, 2020, pp. 8899–8906.
- [7] S. A. Ludwig, “Investigation of orientation estimation of multiple imus,” Unmanned Systems, vol. 9, no. 04, pp. 283–291, 2021.
- [8] S. Yuan, H. Wang, and L. Xie, “Survey on localization systems and algorithms for unmanned systems,” Unmanned Systems, vol. 9, no. 02, pp. 129–163, 2021.
- [9] J. Zhang and S. Singh, “Loam: Lidar odometry and mapping in real-time.” in Robotics: Science and Systems, vol. 2, no. 9. Berkeley, CA, 2014, pp. 1–9.
- [10] T. Shan and B. Englot, “Lego-loam: Lightweight and ground-optimized lidar odometry and mapping on variable terrain,” in 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). IEEE, 2018, pp. 4758–4765.
- [11] H. Wang, C. Wang, C.-L. Chen, and L. Xie, “F-loam: Fast lidar odometry and mapping,” in 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). IEEE, 2021, pp. 4390–4396.
- [12] C. Yuan, W. Xu, X. Liu, X. Hong, and F. Zhang, “Efficient and probabilistic adaptive voxel mapping for accurate online lidar odometry,” IEEE Robotics and Automation Letters, vol. 7, no. 3, pp. 8518–8525, 2022.
- [13] J. Behley and C. Stachniss, “Efficient surfel-based slam using 3d laser range data in urban environments.” in Robotics: Science and Systems, vol. 2018, 2018, p. 59.
- [14] T.-M. Nguyen, D. Duberg, P. Jensfelt, S. Yuan, and L. Xie, “Slict: Multi-input multi-scale surfel-based lidar-inertial continuous-time odometry and mapping,” IEEE Robotics and Automation Letters, vol. 8, no. 4, pp. 2102–2109, 2023.
- [15] C. Park, P. Moghadam, J. L. Williams, S. Kim, S. Sridharan, and C. Fookes, “Elasticity meets continuous-time: Map-centric dense 3d lidar slam,” IEEE Transactions on Robotics, vol. 38, no. 2, pp. 978–997, 2021.
- [16] S. Zhao, H. Zhang, P. Wang, L. Nogueira, and S. Scherer, “Super odometry: Imu-centric lidar-visual-inertial estimator for challenging environments,” in 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). IEEE, 2021, pp. 8729–8736.
- [17] P. Biber and W. Straßer, “The normal distributions transform: A new approach to laser scan matching,” in Proceedings 2003 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2003)(Cat. No. 03CH37453), vol. 3. IEEE, 2003, pp. 2743–2748.
- [18] A. Segal, D. Haehnel, and S. Thrun, “Generalized-icp.” in Robotics: science and systems, vol. 2, no. 4. Seattle, WA, 2009, p. 435.
- [19] K. Koide, M. Yokozuka, S. Oishi, and A. Banno, “Voxelized gicp for fast and accurate 3d point cloud registration,” in 2021 IEEE International Conference on Robotics and Automation (ICRA). IEEE, 2021, pp. 11 054–11 059.
- [20] T. Stoyanov, M. Magnusson, H. Andreasson, and A. J. Lilienthal, “Fast and accurate scan registration through minimization of the distance between compact 3d ndt representations,” The International Journal of Robotics Research, vol. 31, no. 12, pp. 1377–1393, 2012.
- [21] M. Yokozuka, K. Koide, S. Oishi, and A. Banno, “Litamin2: Ultra light lidar-based slam using geometric approximation applied with kl-divergence,” in 2021 IEEE International Conference on Robotics and Automation (ICRA). IEEE, 2021, pp. 11 619–11 625.
- [22] Y. Cai, W. Xu, and F. Zhang, “ikd-tree: An incremental kd tree for robotic applications,” arXiv preprint arXiv:2102.10808, 2021.
- [23] C. Hertzberg, R. Wagner, U. Frese, and L. Schröder, “Integrating generic sensor fusion algorithms with sound state representations through encapsulation of manifolds,” Information Fusion, vol. 14, no. 1, pp. 57–77, 2013.
- [24] A. L. Gibbs and F. E. Su, “On choosing and bounding probability metrics,” International statistical review, vol. 70, no. 3, pp. 419–435, 2002.
- [25] D. He, W. Xu, and F. Zhang, “Kalman filters on differentiable manifolds,” arXiv preprint arXiv:2102.03804, 2021.
- [26] M. Ramezani, Y. Wang, M. Camurri, D. Wisth, M. Mattamala, and M. Fallon, “The newer college dataset: Handheld lidar, inertial and vision with ground truth,” in 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2020, pp. 4353–4360.
- [27] Anonymous, “Mcd viral: Visual-inertial-range-lidar multi-campus dataset with continuous-time ground truth,” 11 2023. [Online]. Available: https://mcdviral.github.io/
- [28] A. Geiger, P. Lenz, and R. Urtasun, “Are we ready for autonomous driving? the kitti vision benchmark suite,” in 2012 IEEE conference on computer vision and pattern recognition. IEEE, 2012, pp. 3354–3361.
- [29] K. Chen, B. T. Lopez, A.-a. Agha-mohammadi, and A. Mehta, “Direct lidar odometry: Fast localization with dense point clouds,” IEEE Robotics and Automation Letters, vol. 7, no. 2, pp. 2000–2007, 2022.