Recent progress in robotics and computing hardware has increased the demand for online metric map reconstruction from cameras. At the same time, the scale of metric maps has increased by two to three orders of magnitude. This poses a significant challenge for current state-of-the-art camera-based large-scale modeling approaches. One of the most demanding applications of vision-based map reconstruction is in robotics. Robots inherently need to model the surrounding environment to navigate safely in a space while performing various tasks.
Traditionally, laser range finders (LIDAR) have been used for this task, mainly because they directly measure, with high precision, the distance to surfaces in the space visited by a robot. However, this type of sensor has significant limitations. The major limitation is that typical LIDAR sensors scan only a 2D slice of the space, and the slice needs to stay in the same plane for an online simultaneous localization and mapping (SLAM) system to work. This limits the use of laser-based SLAM systems in environments that contain objects with complex height profiles (such as tables or shelves) and for robots that move freely in 3D space. Moreover, LIDAR sensors require highly accurate tracking when they are moving on mobile platforms. Another issue is the sensor's size, weight and power consumption, which are significantly larger than those of passive sensors such as video cameras.
In SLAM systems, the most difficult problem is maintaining an environment map (i.e., the perceived model of the environment) that is consistent with all observations, especially when loops exist in the motion trajectory of a robot. Existing SLAM solutions to this problem use bundle adjustment, which scales cubically with the problem size, thus prohibiting online computation in large-scale environments. Bundle adjustment formulates structure from motion as an optimization problem in which each camera is characterized by six degrees of freedom (DOF) for its translation and rotation, plus parameters for the camera calibration and radial distortion. Additionally, each 3D point is parameterized by its three position parameters. The projection equations yield a set of non-linear equations, which are linearized through a Taylor series expansion and solved efficiently with a sparse solver.
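The parameterization described above can be illustrated with a minimal sketch. It is not a bundle adjustment implementation: it only shows one projection equation and its first-order linearization. The eight-element camera layout (axis-angle rotation, translation, a single focal length f, a single radial distortion coefficient k1), the function names, and the use of finite differences in place of an analytic Taylor expansion are all assumptions made for illustration.

```python
import numpy as np

def rodrigues(rvec):
    """Convert an axis-angle vector (3,) into a 3x3 rotation matrix."""
    theta = np.linalg.norm(rvec)
    if theta < 1e-12:
        return np.eye(3)
    k = rvec / theta
    K = np.array([[0.0, -k[2], k[1]],
                  [k[2], 0.0, -k[0]],
                  [-k[1], k[0], 0.0]])
    return np.eye(3) + np.sin(theta) * K + (1.0 - np.cos(theta)) * (K @ K)

def project(camera, point):
    """Project a 3D point into the image.

    camera = [rx, ry, rz, tx, ty, tz, f, k1] (assumed layout):
    six DOF for rotation and translation, plus focal length and
    one radial distortion coefficient.
    """
    rvec, tvec, f, k1 = camera[:3], camera[3:6], camera[6], camera[7]
    p = rodrigues(rvec) @ point + tvec        # world frame -> camera frame
    x, y = p[0] / p[2], p[1] / p[2]           # perspective division
    distortion = 1.0 + k1 * (x * x + y * y)   # radial distortion factor
    return f * distortion * np.array([x, y])

def reprojection_residual(camera, point, observed):
    """Difference between the predicted and the observed image position."""
    return project(camera, point) - observed

def numeric_jacobian(camera, point, eps=1e-6):
    """Linearize the projection around the current estimate.

    Returns a 2 x 11 matrix: 8 camera columns followed by 3 point columns.
    Central differences stand in for the analytic first-order Taylor terms.
    """
    params = np.concatenate([camera, point])
    J = np.zeros((2, params.size))
    for i in range(params.size):
        step = np.zeros(params.size)
        step[i] = eps
        plus, minus = params + step, params - step
        J[:, i] = (project(plus[:8], plus[8:])
                   - project(minus[:8], minus[8:])) / (2.0 * eps)
    return J
```

Each observation contributes one such 2 x 11 block, and it touches only the columns of one camera and one point. Stacking the blocks over all observations gives a mostly empty Jacobian, which is the structure a sparse solver exploits.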
Large-scale reconstruction of environment maps is challenging because the complexity of bundle adjustment is at least cubic in the number of cameras plus linear in the number of points. Topological mapping can be used for online computation in large-scale environments. A topological map represents the environment as a graph with a set of places (nodes) and the relative location information between the places (edges). In this representation, a loop closure does not require any additional error adjustment. In return, however, the map loses its global metric property. For example, a robot cannot perform spatial reasoning about proximity unless a link between the map locations is present in the topological map.
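The graph representation described above can be sketched as follows. The class name TopologicalMap, its methods, and the use of 2D offsets as the relative location information are hypothetical choices made for illustration; the text does not prescribe a particular data structure.

```python
from collections import deque

class TopologicalMap:
    """Places are nodes; each edge stores only a relative 2D offset (assumed)."""

    def __init__(self):
        # place -> {neighbor: offset of the neighbor relative to the place}
        self.links = {}

    def add_link(self, a, b, offset):
        """Record the relative location of b as seen from a, and the reverse."""
        dx, dy = offset
        self.links.setdefault(a, {})[b] = (dx, dy)
        self.links.setdefault(b, {})[a] = (-dx, -dy)

    def are_linked(self, a, b):
        """Proximity is known only when an explicit link exists."""
        return b in self.links.get(a, {})

    def hops(self, start, goal):
        """Breadth-first search; returns the number of links, or None."""
        if start not in self.links or goal not in self.links:
            return None
        queue = deque([(start, 0)])
        seen = {start}
        while queue:
            place, dist = queue.popleft()
            if place == goal:
                return dist
            for nxt in self.links[place]:
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, dist + 1))
        return None
```

In this sketch a loop closure is a single add_link call between the first and last place. No stored offset is revised, so the offsets around the loop need not sum to zero, which is the loss of global metric consistency noted above. Until that link is added, are_linked reports the two places as unrelated even if they are physically adjacent.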
The figures depict various embodiments of the invention for purposes of illustration only. One skilled in the art will readily recognize from the following discussion that alternative embodiments of the structures and methods illustrated herein may be employed without departing from the principles of the invention described herein.