Field
The invention relates to a method for analyzing related images, a corresponding image processing system, a vehicle comprising such a system, and a corresponding computer program product. In particular, the analysis of the related images concerns depth estimation from stereo images, which is preferably used in vehicles equipped with driver assistance systems that rely on image processing, or in autonomous vehicles.
Description of the Related Art
In recent years, considerable progress has been made in the field of image processing. The improved analysis results can therefore be used to produce and output information about the real situation in which an image or a sequence of images was captured. Since the result reflects a particular aspect of the real situation, it can be used either to help a person become aware of this situation and act accordingly, or even to assist in performing a necessary action directly. Such systems are known, for example, as Advanced Driver Assistance Systems (ADAS). Here, images of a traffic situation are captured and analyzed in order to decelerate or accelerate a vehicle, or to alert the driver when an imminent collision is detected.
Many other applications may of course be considered as well. Another prominent example of the use of image processing systems is autonomous devices, which are becoming more and more popular because they increase the comfort of their owners and/or the safety of their users. Such autonomous devices can be autonomous lawn mowers, autonomous cleaning machines that can be used in contaminated areas so that no operator is needed in dangerous areas, and the like. Nevertheless, such autonomous devices still need additional sensors such as radar or sonar sensors, bump sensors or the like, because depth estimation is still problematic. A particular disadvantage of bump sensors is that in some cases they will not trigger the desired change of driving direction. When, for example, the device approaches an obstacle such as a tree branch that is too high to be hit by the bump sensor, the collision cannot be avoided.
All of these vehicles share the problem that conventional equipment such as cameras can capture only 2D images of the real 3D environment. In order to overcome this problem, stereoscopic cameras have been developed. With these stereoscopic cameras two images are captured. These two images are captured at the same time but from different locations, because a stereoscopic camera has two image capturing units, each having its own lens and image sensor. Naturally the two images will differ, because the scenery is captured from the two distinct positions of the camera units. In the past there have already been attempts to exploit such pairs of related images for estimating the depth of objects or locations in the images. If such a calculation of depth can be performed with satisfactory accuracy and reliability, advantages over other distance measurement means can be exploited. Cameras, and consequently also stereoscopic cameras, are passive sensors, and energy efficiency is therefore one advantage. Energy efficiency is a general issue, but it is of particular relevance for systems that need to carry an energy storage onboard, like the accumulator of an autonomous lawn mower. Furthermore, the scalability of such systems eases their adaptation, since only the distance between the two camera units (baseline) needs to be altered to use the system for another range of distances.
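The role of the baseline can be illustrated with the standard pinhole relation for a rectified stereo pair, Z = f * B / d, where d is the pixel offset of a point between the two images (its disparity). This relation is not stated in the present text and is assumed here purely for illustration. The function name, the focal length and the baseline values in the following sketch are hypothetical.

```python
def depth_from_disparity(disparity_px, focal_length_px, baseline_m):
    """Depth in meters from Z = f * B / d (assumed rectified pinhole model)."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_length_px * baseline_m / disparity_px


# Hypothetical camera: focal length of 800 px, measured disparity of 8 px.
depth_short_baseline = depth_from_disparity(8, 800.0, 0.10)  # 10 cm baseline
depth_long_baseline = depth_from_disparity(8, 800.0, 0.30)   # 30 cm baseline
```

Under this assumed model, the same disparity corresponds to a depth three times larger when the baseline is tripled, which is one way to read the statement that only the baseline needs to be altered to cover another range of distances.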
In order to improve the performance of stereoscopic image processing systems, different limitations have been addressed in the past. EP 2 386 998 A1, for example, suggests a two-stage correlation method for correspondence search in stereoscopic images that deals with the problems conventional approaches have with high contrast. As in already known methods for calculating depth in stereoscopic image processing, a patch of one of the two images is correlated with patches of corresponding size in the second image in order to find the patch in the second image that best matches the patch in the first image. From the distance, measured in pixels, between the best matching patch in the second image and the patch in the first image (disparity), it is then possible to calculate a distance or depth of the object.
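The generic patch search described above can be sketched as follows. This is a minimal illustration, not the two-stage method of EP 2 386 998 A1. It uses the sum of absolute differences as a stand-in for the correlation measure, assumes rectified images so that the search runs along a single image row, and works on a synthetic image pair. All function names and parameters are hypothetical.

```python
def patch_cost(left, right, row, col, d, half):
    """Sum of absolute differences between the left patch centered at
    (row, col) and the right patch shifted d pixels to the left."""
    cost = 0
    for dy in range(-half, half + 1):
        for dx in range(-half, half + 1):
            cost += abs(left[row + dy][col + dx] - right[row + dy][col + dx - d])
    return cost


def best_disparity(left, right, row, col, half=1, max_disparity=4):
    """Return the shift d (in pixels) whose right-image patch matches best."""
    best_d, best_cost = 0, float("inf")
    for d in range(max_disparity + 1):
        if col - half - d < 0:
            break  # the shifted patch would leave the image
        cost = patch_cost(left, right, row, col, d, half)
        if cost < best_cost:
            best_d, best_cost = d, cost
    return best_d


def make_pair(shift, width=12, height=5):
    """Synthetic pair: the right image shows the same texture as the left
    image, displaced by `shift` pixels (a fronto-parallel surface)."""
    texture = [[x * x + 10 * y for x in range(width + shift)] for y in range(height)]
    left = [r[:width] for r in texture]
    right = [r[shift:shift + width] for r in texture]
    return left, right


left_image, right_image = make_pair(shift=2)
found = best_disparity(left_image, right_image, row=2, col=6)
```

In this synthetic case the search recovers the displacement of 2 pixels that was used to build the pair, because every pixel of the patch is shifted by the same amount.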
For so-called fronto-parallel objects in the images, this gives satisfactory results. Each pixel of the patch that is defined in the first image finds its equivalent in a corresponding patch of the same size and shape in the second image. This corresponding patch is similar to the patch in the first image but is located at a position in the second image that differs from the position of the patch in the first image. Thus, it is easy to calculate a measure for the similarity of the two corresponding patches. But naturally not all objects that can be identified in an image are fronto-parallel. This strongly limits the usability of stereoscopic image analysis for distance evaluation, for example in driver assistance systems, where objects such as a road are typically present, and a road is obviously not fronto-parallel. Objects that are not fronto-parallel lead to a spatially different arrangement of corresponding pixels in the first image and in the second image. Calculating a measure for the similarity between the pixels of a patch in the first image and the pixels of a patch of the same size and shape in the second image then makes it difficult to identify a disparity for such non-fronto-parallel objects.
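The difficulty can be made concrete with a small synthetic experiment. The sketch below assumes a simplified road-like surface in which each image row has its own disparity, and again uses the sum of absolute differences as a stand-in similarity measure. The helper names and the chosen disparity values are hypothetical.

```python
def sad(left, right, row, col, d, half=1):
    """Sum of absolute differences for a square patch and a single shift d."""
    return sum(
        abs(left[row + dy][col + dx] - right[row + dy][col + dx - d])
        for dy in range(-half, half + 1)
        for dx in range(-half, half + 1)
    )


def make_pair(row_disparities, width=12):
    """Synthetic pair in which image row y has disparity row_disparities[y]."""
    margin = max(row_disparities)
    left, right = [], []
    for y, d in enumerate(row_disparities):
        texture = [x * x + 10 * y for x in range(width + margin)]
        left.append(texture[:width])
        right.append(texture[d:d + width])
    return left, right


def min_cost(left, right, row, col, max_disparity=4):
    """Lowest patch cost over all candidate shifts."""
    return min(sad(left, right, row, col, d) for d in range(max_disparity + 1))


# Fronto-parallel surface: every row is displaced by the same 2 pixels.
fronto_left, fronto_right = make_pair([2, 2, 2, 2, 2])
# Road-like surface: the disparity grows from row to row.
road_left, road_right = make_pair([0, 1, 2, 3, 4])

fronto_cost = min_cost(fronto_left, fronto_right, row=2, col=6)
road_cost = min_cost(road_left, road_right, row=2, col=6)
```

For the fronto-parallel pair one shift aligns the whole patch, so the lowest cost is zero. For the road-like pair no single shift aligns all rows of the square patch at once, so the lowest cost stays above zero and the best match is less distinct.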
Commonly known stereo computation algorithms are described in Scharstein, D. and Szeliski, R. (2002), “A taxonomy and evaluation of dense two-frame stereo correspondence algorithms”, International Journal of Computer Vision 47(1-3):7-42. However, the problem of objects that are not fronto-parallel is not addressed there.