Autonomous drone racing

In this video, we showcase aggressive autonomous drone races.

The drone is equipped with a pair of stereo cameras and a DJI N3 flight controller. All computing during the flight is done onboard. Our system consists of visual-inertial SLAM with loop closure, global mapping, local mapping, global trajectory optimization, local re-planning, and human-drone interaction interfaces.

Our system is built upon a teach-and-repeat framework. A dense and globally consistent map is built before each experiment. In the teaching phase, the drone is piloted to provide a topological path (i.e., which hula hoops the drone should fly through). In the repeating/execution phase, the drone converts the teaching path into a topologically equivalent optimal trajectory based on the known global obstacle map. The drone then executes the trajectory at the user-expected velocity. The execution velocity can differ from that in the teaching phase. In fact, the drone flies much more aggressively during the execution phase thanks to the optimal trajectory generation. During execution, state estimation and mapping remain active, so that the drone can avoid any new obstacles not present in the global map. Vision-based loop closure ensures that the drone operates in the same coordinate system in both the teaching and the repeating phases.

We aim for better-than-human performance in challenging drone racing scenarios. Four video clips showcase the performance in indoor and outdoor, static and dynamic environments:
1. Indoor autonomous drone racing in a static environment
2. Indoor autonomous drone racing in an environment with unknown obstacles
3. Outdoor autonomous drone racing, trial 1
4. Outdoor autonomous drone racing, trial 2

Tong Qin wins IROS 2018 Best Student Paper Award

On October 4th, 2018, the paper "Online temporal calibration for monocular visual-inertial systems" by Ph.D. student Tong Qin won the Best Student Paper Award at the 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2018) in Madrid, Spain.

In this paper, we propose an online approach for calibrating the temporal offset between visual and inertial measurements. Our approach achieves temporal offset calibration by jointly optimizing the time offset, camera and IMU states, and feature locations in a SLAM system. Furthermore, the approach is a general model that can be easily employed in several feature-based optimization frameworks. Simulation and experimental results demonstrate the high accuracy of our calibration approach, even compared with other state-of-the-art offline tools. The VIO comparison against other methods shows that online temporal calibration significantly benefits visual-inertial systems. The source code for temporal calibration is integrated into our public project, VINS-Mono.
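As a rough illustration of how a time offset can enter a feature-based residual, the sketch below assumes that a tracked feature moves at constant velocity on the image plane over a short interval, so that an observation can be shifted along that velocity by a candidate offset `td`. The function names and the two-frame velocity estimate are assumptions made for this example; in the paper the offset is optimized jointly with the other states rather than evaluated in isolation.

```python
from typing import Tuple

Pixel = Tuple[float, float]


def feature_velocity(z_prev: Pixel, z_next: Pixel,
                     t_prev: float, t_next: float) -> Pixel:
    """Image-plane velocity of a tracked feature between two frames (pixels/s)."""
    dt = t_next - t_prev
    if dt <= 0.0:
        raise ValueError("frame timestamps must be strictly increasing")
    return ((z_next[0] - z_prev[0]) / dt, (z_next[1] - z_prev[1]) / dt)


def shift_observation(z: Pixel, velocity: Pixel, td: float) -> Pixel:
    """Move an observation along its image velocity by the time offset td (s)."""
    return (z[0] + td * velocity[0], z[1] + td * velocity[1])


def residual_norm(z: Pixel, velocity: Pixel, td: float, predicted: Pixel) -> float:
    """Distance between the shifted observation and a predicted projection.

    `predicted` stands in for the projection computed from the estimator's
    states; it is a hypothetical input for this sketch.
    """
    zx, zy = shift_observation(z, velocity, td)
    return ((zx - predicted[0]) ** 2 + (zy - predicted[1]) ** 2) ** 0.5
```

With a correct `td`, the shifted observation lines up with the prediction and the residual shrinks; an optimizer can therefore treat `td` as one more variable to estimate.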

IROS 2018

Visual-Inertial State Estimation

Monocular visual-inertial state estimation with online initialization and camera-IMU extrinsic calibration

By Zhenfei YANG

There have been increasing demands for developing micro aerial vehicles with vision-based autonomy for search and rescue missions in complex environments. In particular, the monocular visual-inertial system (VINS), which consists of only an inertial measurement unit (IMU) and a camera, forms a great lightweight sensor suite due to its low weight and small footprint. In this paper, we address two challenges for rapid deployment of monocular VINS: 1) the initialization problem and 2) the calibration problem. We propose a methodology that is able to initialize velocity, gravity, visual scale, and camera-IMU extrinsic calibration on the fly. Our approach operates in natural environments and does not use any artificial markers. It also does not require any prior knowledge about the mechanical configuration of the system. It is a significant step toward plug-and-play and highly customizable visual navigation for mobile robots. We show through online experiments that our method leads to accurate calibration of the camera-IMU transformation, with errors less than 0.02 m in translation and 1° in rotation. We compare our method with a state-of-the-art marker-based offline calibration method and show superior results. We also demonstrate the performance of the proposed approach in large-scale indoor and outdoor experiments.


Self-calibrating multi-camera visual-inertial fusion for autonomous MAVs

By Zhenfei YANG

We address the important problem of achieving robust and easy-to-deploy visual state estimation for micro aerial vehicles (MAVs) operating in complex environments. We use a sensor suite consisting of multiple cameras and an IMU to maximize perceptual awareness of the surroundings and provide sufficient redundancy against sensor failures. Our approach starts with an online initialization procedure that simultaneously estimates the transformation between each camera and the IMU, as well as the initial velocity and attitude of the platform, without any prior knowledge about the mechanical configuration of the sensor suite. Based on the initial calibrations, a tightly-coupled, optimization-based, generalized multi-camera-inertial fusion method runs onboard the MAV with online camera-IMU calibration refinement and identification of sensor failures. Our approach dynamically configures the system into monocular, stereo, or other multicamera visual-inertial settings, with their respective perceptual advantages, based on the availability of visual measurements. We show that even under random camera failures, our method can be used for feedback control of the MAVs. We highlight our approach in challenging indoor-outdoor navigation tasks with large variations in vehicle height and speed, scene depth, and illumination.


Aggressive quadrotor flight using dense visual-inertial fusion

By Yonggen LING

In this work, we address the problem of aggressive flight of a quadrotor aerial vehicle using cameras and IMUs as the only sensing modalities. We present a fully integrated quadrotor system and demonstrate through online experiments the capability of autonomous flight with linear velocities up to 4.2 m/s, linear accelerations up to 9.6 m/s², and angular velocities up to 245.1 degrees/s. Central to our approach is a dense visual-inertial state estimator for reliable tracking of aggressive motions. An uncertainty-aware direct dense visual tracking module provides camera pose tracking that takes inverse depth uncertainty into account and is resistant to motion blur. Measurements from IMU pre-integration and multi-constrained dense visual tracking are fused probabilistically using an optimization-based sensor fusion framework. Extensive statistical analysis and comparison are presented to verify the performance of the proposed approach. We also release our code as open-source ROS packages.
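To give a feel for what IMU pre-integration produces, here is a deliberately simplified planar sketch: body-frame accelerations and yaw rates between two camera frames are accumulated into a relative position, velocity, and heading change expressed in the body frame at the start of the interval. Motion is assumed to stay in a horizontal plane, and biases, noise covariance, and gravity handling are all omitted, so this is an assumption-laden toy rather than the estimator described above.

```python
import math
from typing import Iterable, Tuple

# One IMU sample: (ax, ay, yaw_rate, dt), acceleration given in the body frame.
Sample = Tuple[float, float, float, float]
Vec2 = Tuple[float, float]


def preintegrate_planar(samples: Iterable[Sample]) -> Tuple[Vec2, Vec2, float]:
    """Accumulate relative position, velocity, and yaw over a batch of samples.

    All outputs are expressed in the body frame at the first sample, so they
    do not depend on the (unknown) global pose or velocity at that time.
    """
    px = py = vx = vy = theta = 0.0
    for ax, ay, yaw_rate, dt in samples:
        # Rotate the body-frame acceleration into the start frame.
        c, s = math.cos(theta), math.sin(theta)
        ax0 = c * ax - s * ay
        ay0 = s * ax + c * ay
        # Simple Euler-style integration of position, velocity, and heading.
        px += vx * dt + 0.5 * ax0 * dt * dt
        py += vy * dt + 0.5 * ay0 * dt * dt
        vx += ax0 * dt
        vy += ay0 * dt
        theta += yaw_rate * dt
    return (px, py), (vx, vy), theta
```

Because the result depends only on the raw measurements, it can be computed once per frame pair and reused as a single relative-motion constraint during optimization.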


High altitude monocular visual-inertial state estimation: initialization and sensor fusion

By Tianbo LIU

Obtaining reliable state estimates in high-altitude but GPS-denied environments, such as between high-rise buildings or in the middle of deep canyons, is known to be challenging due to the lack of direct distance measurements. Monocular visual-inertial systems provide a possible way to recover the metric distance through proper integration of visual and inertial measurements. However, the nonlinear optimization problem for state estimation suffers from poor numerical conditioning or even degeneration, due to difficulties in obtaining observations of visual features with sufficient parallax and the excessive period of inertial measurement integration. Here we propose a spline-based high-altitude estimator initialization method for monocular visual-inertial navigation systems (VINS), with special attention to the numerical issues. Our formulation takes only inertial measurements that contain sufficient excitation, and drops uninformative measurements such as those obtained during hovering. In addition, our method explicitly reduces the number of parameters to be estimated in order to achieve earlier convergence. Based on the initialization results, a complete closed-loop system is constructed for high-altitude navigation. Extensive experiments are conducted to validate our approach.
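The idea of keeping only inertial measurements with sufficient excitation can be sketched as a simple filter over windows of accelerometer samples. The excitation measure used here (square root of the summed per-axis variance) and the threshold value are assumptions chosen for illustration; the abstract does not state the criterion actually used.

```python
import math
from statistics import pvariance
from typing import List, Sequence, Tuple

Accel = Tuple[float, float, float]  # accelerometer sample in m/s^2


def excitation(window: Sequence[Accel]) -> float:
    """Spread of the acceleration samples in a window (m/s^2).

    Near-constant readings, as seen while hovering, give a value close to 0.
    """
    if len(window) < 2:
        return 0.0
    return math.sqrt(sum(pvariance([a[i] for a in window]) for i in range(3)))


def select_informative(windows: Sequence[Sequence[Accel]],
                       threshold: float = 0.25) -> List[Sequence[Accel]]:
    """Keep only windows whose excitation exceeds the (assumed) threshold."""
    return [w for w in windows if excitation(w) > threshold]
```

A hovering window made of near-identical gravity readings would be dropped, while a window recorded during a maneuver would be kept.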


Robust initialization of monocular visual-inertial estimation on aerial robots

By Tong QIN

We propose a robust on-the-fly estimator initialization algorithm to provide high-quality initial states for monocular visual-inertial systems (VINS). Due to the non-linearity of VINS, a poor initialization can severely impact the performance of either filtering-based or graph-based methods. Our approach starts with vision-only structure from motion (SfM) to build the up-to-scale structure of camera poses and feature positions. By loosely aligning this structure with pre-integrated IMU measurements, our approach recovers the metric scale, velocity, gravity vector, and gyroscope bias, which are treated as initial values to bootstrap the nonlinear tightly-coupled optimization framework. We highlight that our approach can perform on-the-fly initialization in various scenarios without using any prior information about system states and movement. The performance of the proposed approach is verified on a public UAV dataset and in real-time onboard experiments. We make our implementation open source; it is the initialization module integrated into VINS-Mono.
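The alignment step can be pictured with a much-reduced toy problem: given up-to-scale displacements from SfM and matching metric displacements, the scale that best maps one onto the other has a closed-form least-squares solution. The metric displacements are treated here as known inputs, which is an assumption for illustration only; the approach described above solves for scale together with velocity and gravity, since IMU measurements alone do not give metric displacement directly.

```python
from typing import Sequence, Tuple

Vec3 = Tuple[float, float, float]


def dot(a: Vec3, b: Vec3) -> float:
    """Dot product of two 3-vectors."""
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def recover_scale(sfm_deltas: Sequence[Vec3],
                  metric_deltas: Sequence[Vec3]) -> float:
    """Least-squares scale s minimizing sum ||s * sfm_i - metric_i||^2."""
    if len(sfm_deltas) != len(metric_deltas):
        raise ValueError("both sequences must have the same length")
    num = sum(dot(d, m) for d, m in zip(sfm_deltas, metric_deltas))
    den = sum(dot(d, d) for d in sfm_deltas)
    if den == 0.0:
        raise ValueError("SfM displacements are all zero; scale is unobservable")
    return num / den
```

The zero-denominator check mirrors a practical point: without enough translation between frames, the scale cannot be recovered.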


Invited Talk: ICRA 2016 Tutorial

Professor Shaojie SHEN gave a tutorial lecture on sensor fusion for state estimation at the 2016 IEEE International Conference on Robotics and Automation (ICRA 2016), which was held in Stockholm, Sweden. The tutorial provided an introduction to the theory and practice of aerial robots, with a mix of fundamentals and applications. It exposed participants to the state of the art in robot design, mechanics, control, estimation, perception, and planning. The lecture aimed to answer three questions: “What sensors besides cameras are required for state estimation?”, “How do we fuse information from different cameras and ensure redundancy?”, and “How do we calibrate different systems?”

Suggested readings:
• Z. Yang and S. Shen, “Monocular Visual-Inertial State Estimation with Online Initialization and Camera-IMU Extrinsic Calibration”, IEEE Transactions on Automation Science and Engineering, 2016
• S. Shen, Y. Mulgaonkar, N. Michael, and V. Kumar, “Multi-Sensor Fusion for Robust Autonomous Flight in Indoor and Outdoor Environments with a Rotorcraft MAV”, IEEE International Conference on Robotics and Automation, Hong Kong, China, May 2014


Related links

ICRA 2016 in Stockholm

ICRA 2016 Tutorial: Aerial Robotics

Presentation slide: Sensor Fusion for Aerial Robots