Monocular visual-inertial state estimation with online initialization and camera-IMU extrinsic calibration

By Zhenfei YANG

There have been increasing demands for developing microaerial vehicles with vision-based autonomy for search and rescue missions in complex environments. In particular, the monocular visual-inertial system (VINS), which consists of only an inertial measurement unit (IMU) and a camera, forms a great lightweight sensor suite due to its low weight and small footprint. In this paper, we address two challenges for rapid deployment of monocular VINS: 1) the initialization problem and 2) the calibration problem. We propose a methodology that is able to initialize velocity, gravity, visual scale, and camera-IMU extrinsic calibration on the fly. Our approach operates in natural environments and does not use any artificial markers. It also does not require any prior knowledge about the mechanical configuration of the system. It is a significant step toward plug-and-play and highly customizable visual navigation for mobile robots. We show through online experiments that our method leads to accurate calibration of camera-IMU transformation, with errors less than 0.02 m in translation and 1° in rotation. We compare out method with a state-of-the-art marker-based offline calibration method and show superior results. We also demonstrate the performance of the proposed approach in large-scale indoor and outdoor experiments.


Self-calibrating multi-camera visual-inertial fusion for autonomous MAVs

By Zhenfei YANG

We address the important problem of achieving robust and easy-to-deploy visual state estimation for micro aerial vehicles (MAVs) operating in complex environments. We use a sensor suite consisting of multiple cameras and an IMU to maximize perceptual awareness of the surroundings and provide sufficient redundancy against sensor failures. Our approach starts with an online initialization procedure that simultaneously estimates the transformation between each camera and the IMU, as well as the initial velocity and attitude of the platform, without any prior knowledge about the mechanical configuration of the sensor suite. Based on the initial calibrations, a tightly-coupled, optimization-based, generalized multi-camera-inertial fusion method runs onboard the MAV with online camera-IMU calibration refinement and identification of sensor failures. Our approach dynamically configures the system into monocular, stereo, or other multicamera visual-inertial settings, with their respective perceptual advantages, based on the availability of visual measurements. We show that even under random camera failures, our method can be used for feedback control of the MAVs. We highlight our approach in challenging indoor-outdoor navigation tasks with large variations in vehicle height and speed, scene depth, and illumination.


 Aggressive quadrotor flight using dense visual-inertial fusion

By Yonggen LING

In this work, we address the problem of aggressive flight of a quadrotor aerial vehicle using cameras and IMUs as the only sensing modalities. We present a fully integrated quadrotor system and demonstrate through online experiment the capability of autonomous flight with linear velocities up to 4.2 m/s, linear accelerations up to 9.6 m/s2 , and angular velocities up to 245.1 degree/s. Central to our approach is a dense visual-inertial state estimator for reliable tracking of aggressive motions. An uncertainty-aware direct dense visual tracking module provides camera pose tracking that takes inverse depth uncertainty into account and is resistant to motion blur. Measurements from IMU pre-integration and multi-constrained dense visual tracking are fused probabilistically using an optimization-based sensor fusion framework. Extensive statistical analysis and comparison are presented to verify the performance of the proposed approach. We also release our code as open-source ROS packages.


High altitude monocular visual-inertial state estimation: initialization and sensor fusion

By Tianbo LIU

Obtaining reliable state estimates at high altitude but GPS-denied environments, such as between high-rise buildings or in the middle of deep canyons, is known to be challenging, due to the lack of direct distance measurements. Monocular visual-inertial systems provide a possible way to recover the metric distance through proper integration of visual and inertial measurements. However, the nonlinear optimization problem for state estimation suffers from poor numerical conditioning or even degeneration, due to difficulties in obtaining observations of visual features with sufficient parallax, and the excessive period of inertial measurement integration. Here we propose a spline-based high altitude estimator initialization method for monocular visual-inertial navigation system (VINS) with special attention to the numerical issues. Our formulation takes only inertial measurements that contain sufficient excitation, and drops uninformative measurements such as those obtained during hovering. In addition, our method explicitly reduces the number of parameters to be estimated in order to achieve earlier convergence. Based on the initialization results, a complete closed-loop system is constructed for high altitude navigation. Extensive experiments are conducted to validate our approach.


Robust initialization of monocular visual-inertial estimation on aerial robots

By Tong QIN

We propose a robust on-the-fly estimator initialization algorithm to provide high-quality initial states for monocular visual-inertial systems (VINS). Due to the non-linearity of VINS, a poor initialization can severely impact the performance of either filtering-based or graph-based methods. Our approach starts with a vision-only structure from motion (SfM) to build the up-to-scale structure of camera poses and feature positions. By loosely aligning this structure with pre-integrated IMU measurements, our approach recovers the metric scale, velocity, gravity vector, and gyroscope bias, which are treated as initial values to bootstrap the nonlinear tightly-coupled optimization framework. We highlight that our approach can perform on-the-fly initialization in various scenarios without using any prior information about system states and movement. The performance of the proposed approach is verified through the public UAV dataset and real-time onboard experiment. We make our implementation open source, which is the initialization part integrated in the VINS-Mono.