Visual odometry makes it possible to calculate the motion of a camera system from a sequence of images. In the case of a stereo system, the rotation about all three axes as well as the translation along the axes, and consequently the motion in all six degrees of freedom, can be reconstructed. This information can be used as an estimate of the current driving states for different driver assistance systems such as an anti-lock braking system (ABS), electronic stability control (ESC), cruise control or roll stability control (RSC).
Nowadays, a combination of different sensors is used for the self-localization of vehicles. These include, for example, rotational speed sensors, acceleration sensors, odometry and GPS systems.
Camera-based egomotion estimates are very important as an alternative sensor concept for increasing the accuracy and, in particular, the availability of the determination of the vehicle's own position in critical scenarios such as street canyons or tunnels. They extend the previous sensor concept with an additional and independent sensor source.
WO 2013/037840 A1 discloses a method for determining position data of a vehicle. Driving dynamics data of the vehicle are measured. Position data of the vehicle are measured with an environment sensor which acquires the position data on the basis of at least one distance from a stationary object arranged with respect to the vehicle. Finally, position data of the vehicle are determined on the basis of the driving dynamics data and the position data. Here, a vehicle camera can be used as an environment sensor.
WO 2010/099789 A1 discloses a method for automatically detecting a driving maneuver of a motor vehicle, in particular a passing maneuver or an evasive maneuver. In this case, data from the vehicle sensor technology and lane marker detection based on data from the video sensor technology can be fused in the odometry. The odometry makes it possible to estimate the position, speed and orientation of the vehicle on the lane as well as additional state variables. These estimated variables can be made available to maneuver detection, other situation analysis algorithms or for control tasks.
In addition, the information from a camera-based egomotion estimate can be used as input for any dead reckoning approach to estimate the trajectory and location of the egovehicle. With such approaches, the vehicle's own location can be continuously estimated, taking account of a starting position and the subsequent course of the motion and speed (translation and rotation).
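By way of illustration only, such a dead reckoning approach can be sketched as follows. The sketch is restricted to planar motion (position x, y and heading) rather than all six degrees of freedom, and the function name `dead_reckon` and the per-step input format (travelled distance, change of heading) are assumptions made for this example.

```python
import math

def dead_reckon(start, motions):
    """Accumulate a planar trajectory from per-step egomotion estimates.

    start:   (x, y, heading) of the starting position, heading in radians
    motions: iterable of (distance, delta_heading) per time step
             (hypothetical input format chosen for this sketch)
    """
    x, y, heading = start
    trajectory = [(x, y, heading)]
    for distance, delta_heading in motions:
        # Translate along the current heading, then apply the rotation.
        x += distance * math.cos(heading)
        y += distance * math.sin(heading)
        heading += delta_heading
        trajectory.append((x, y, heading))
    return trajectory

# Drive 1 m straight ahead, turn left by 90 degrees, drive 1 m again.
trajectory = dead_reckon((0.0, 0.0, 0.0), [(1.0, math.pi / 2), (1.0, 0.0)])
```

Because every step builds on the previous pose, errors in the individual egomotion estimates accumulate over time, which is why the accuracy of each estimate matters for such approaches.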
Due to the high level of accuracy that can be achieved, the information for each time step can furthermore be combined and used to reconstruct the trajectory or to localize the vehicle, for example as a source of information for risk assessments, trajectory planning and car2X exchange of information, right up to autonomous driving. These data can be used, for example, to supplement an existing GPS system and to support it in critical situations. The need for this is demonstrated by investigations according to which GPS availability is only 30% in Hong Kong or Calgary.
A value of only 10% availability has even been established for construction site scenarios, which are very similar to street canyons.
Visual odometry can provide considerable support in precisely these scenarios since, in contrast to GPS receivers, it is not reliant on external signals. A distinction can be made between visual odometry systems, for example, based on the time window of their calculation. In addition to calculating motions between two consecutive time steps, additional information can be used from less recent time steps. Here, a distinction can be made between Simultaneous Localization and Mapping (SLAM) and Bundle Adjustment. Whereas in the case of SLAM, mapping is also carried out in addition to estimating egomotion, the aim of Bundle Adjustment is to improve the egomotion estimate by subsequently optimizing the triangulated spatial points.
Challenging situations for camera-based systems include, in particular, an insufficient number of correspondences of static scene points, heavily changing illumination and low brightness, an unstructured environment with homogeneous, non-textured surfaces, or an unsuitably low frame rate. High-speed scenarios, for example along freeways or country roads, combine several of these problems, making them one of the most challenging situations. In particular, the loss of, or the significantly lower number of, suitable near features (correspondences) complicates the estimation of the egomotion.
The essential part of any visual odometry system is the detection of outliers. A variety of methods for detecting outliers are known. Purely flow-based approaches are based on the assumption that the optical flow follows patterns which are induced by the egomotion of the vehicle. Furthermore, model-based approaches exist, which explicitly constrain the flow using a certain motion model. Many known methods use reprojection error-based approaches.
B. Kitt et al. demonstrate, for example, in the publication “Visual odometry based on stereo image sequences with RANSAC-based outlier rejection scheme”, IEEE Intelligent Vehicles Symposium, 2010, the calculation of a feature-based reprojection error for removing outliers, wherein the reprojection error is compared with a constant threshold.
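The general idea of a reprojection error compared with a constant threshold can be sketched as follows. This is a generic pinhole-camera illustration and not the formulation of the cited publication; the camera parameters `f`, `cx`, `cy`, the threshold value and all function names are assumptions made for this example.

```python
import math

def transform(R, t, point):
    """Apply the rigid motion (3x3 rotation R, translation t) to a 3D point."""
    return tuple(
        sum(R[i][j] * point[j] for j in range(3)) + t[i] for i in range(3)
    )

def project(point, f, cx, cy):
    """Project a 3D point in camera coordinates onto the image plane."""
    X, Y, Z = point
    return (f * X / Z + cx, f * Y / Z + cy)

def reprojection_error(R, t, point_prev, pixel_curr, f, cx, cy):
    """Pixel distance between the predicted and the observed feature position."""
    u, v = project(transform(R, t, point_prev), f, cx, cy)
    return math.hypot(u - pixel_curr[0], v - pixel_curr[1])

def is_outlier(error, threshold=1.0):
    """Constant threshold in pixels (value assumed for this sketch)."""
    return error > threshold

# Motion hypothesis: no rotation, scene points come 1 m closer to the camera.
R = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
t = (0.0, 0.0, -1.0)
f, cx, cy = 500.0, 320.0, 240.0

point_prev = (1.0, 0.0, 5.0)  # triangulated point from the previous time step
err_good = reprojection_error(R, t, point_prev, (445.0, 240.0), f, cx, cy)
err_bad = reprojection_error(R, t, point_prev, (450.0, 240.0), f, cx, cy)
```

In this example the first observation agrees with the motion hypothesis, whereas the second deviates by five pixels and would be rejected with the assumed threshold of one pixel.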
In the case of RANSAC methods, a minimum number of correspondences selected at random is used, in each iteration, to create a motion hypothesis. Then, a score for each feature is calculated that describes whether the feature supports the motion hypothesis. If the motion hypothesis achieves sufficient support from the features, the non-supporting features are rejected as outliers. Otherwise, a minimum number of correspondences is selected at random again.
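The RANSAC scheme described above can be sketched as follows. To keep the example short, the motion hypothesis is reduced to a pure 2D image translation, for which a single correspondence is the minimum sample; this toy model, the threshold values and all names are assumptions made for this illustration and do not represent a real egomotion model.

```python
import math
import random

def fit_translation(sample):
    """Motion hypothesis: mean displacement of the sampled correspondences."""
    n = len(sample)
    tx = sum(q[0] - p[0] for p, q in sample) / n
    ty = sum(q[1] - p[1] for p, q in sample) / n
    return (tx, ty)

def error(hypothesis, correspondence):
    """Distance between the predicted and the observed feature position."""
    (px, py), (qx, qy) = correspondence
    return math.hypot(px + hypothesis[0] - qx, py + hypothesis[1] - qy)

def ransac(correspondences, min_samples, threshold, min_support,
           max_iters=100, seed=0):
    rng = random.Random(seed)
    for _ in range(max_iters):
        # Create a motion hypothesis from a random minimal sample.
        sample = rng.sample(correspondences, min_samples)
        hypothesis = fit_translation(sample)
        # Score every feature: does it support the hypothesis?
        supports = [error(hypothesis, c) < threshold for c in correspondences]
        if sum(supports) >= min_support:
            # Sufficient support: reject the non-supporting features.
            inliers = [c for c, s in zip(correspondences, supports) if s]
            return hypothesis, inliers
        # Otherwise draw a new random sample in the next iteration.
    return None, []

# Eight features moving by (1, 0) and two outliers with deviating motion.
features = [((float(i), 2.0 * i), (i + 1.0, 2.0 * i)) for i in range(8)]
features += [((0.0, 0.0), (5.0, 5.0)), ((1.0, 1.0), (-3.0, 4.0))]

hypothesis, inliers = ransac(features, min_samples=1, threshold=0.5,
                             min_support=7)
```

The quality of the result depends on at least one random sample consisting only of inliers, which is why the number of iterations is typically chosen according to the expected outlier ratio.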
Alternative methods can be grouped together as “MASOR” methods (MAximum Subset Outlier Removal). Here, the maximum number of features is taken to calculate a motion hypothesis. The calculation of the motion hypothesis and a subsequent outlier removal step are repeated in an iterative scheme. In each iteration, a support score is calculated for every feature. Instead of judging the motion hypothesis, the support score is interpreted as a measure of the quality of a feature, since the hypothesis is considered to be a good estimate. Non-supporting features are rejected and the next iteration starts with the remaining features. This process is repeated until a termination criterion is met.
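The iterative scheme of such a method can be sketched as follows. As in any toy example, the details are assumptions made for this illustration: the motion hypothesis is reduced to a pure 2D image translation, the rejection rule uses a cutoff of the larger of a fixed threshold and a multiple of the median error, and the termination criterion is that no further feature is rejected or that a maximum number of iterations is reached.

```python
import math
from statistics import median

def fit_translation(correspondences):
    """Motion hypothesis from all given features (mean displacement)."""
    n = len(correspondences)
    tx = sum(q[0] - p[0] for p, q in correspondences) / n
    ty = sum(q[1] - p[1] for p, q in correspondences) / n
    return (tx, ty)

def error(hypothesis, correspondence):
    """Support score of a feature: small error means good support."""
    (px, py), (qx, qy) = correspondence
    return math.hypot(px + hypothesis[0] - qx, py + hypothesis[1] - qy)

def masor(correspondences, threshold=0.5, factor=3.0, max_iters=10):
    remaining = list(correspondences)
    # Start with the maximum subset, i.e. all available features.
    hypothesis = fit_translation(remaining)
    for _ in range(max_iters):
        errors = [error(hypothesis, c) for c in remaining]
        # Assumed rejection rule: adaptive cutoff, never below `threshold`.
        cutoff = max(threshold, factor * median(errors))
        kept = [c for c, e in zip(remaining, errors) if e <= cutoff]
        if len(kept) == len(remaining):
            break  # termination criterion: nothing was rejected
        remaining = kept
        hypothesis = fit_translation(remaining)
    return hypothesis, remaining

# Eight features moving by (1, 0) and two outliers with deviating motion.
features = [((float(i), 2.0 * i), (i + 1.0, 2.0 * i)) for i in range(8)]
features += [((0.0, 0.0), (5.0, 5.0)), ((1.0, 1.0), (-3.0, 4.0))]

hypothesis, remaining = masor(features)
```

In contrast to the random sampling of RANSAC, the first hypothesis here is computed from all features and is therefore biased by the outliers; it is assumed to be good enough that the outliers nevertheless receive the worst support scores and are removed first.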