This application claims the priority of German Application No. 19636028.5, filed Sep. 5, 1996, the disclosure of which is expressly incorporated by reference herein.
The invention relates to a method for detecting and possibly tracking objects by recording and evaluating stereo images. A method of this type is useful for example as an aid for automated guidance of highway vehicles, and also in the field of robotics, where it may be necessary to detect relevant objects and determine their position and size.
In autonomous vehicle guidance, for example in urban stop-and-go traffic, all moving objects in the immediate vicinity must be detected in addition to fixed objects located in the travel area of the vehicle. On the other hand, no complete, dense depth map of the scene is necessary for such applications. Instead, in most cases a relatively flat base surface and clearly elevated objects can be assumed as a point of departure. In addition, no detailed information, such as the exact shape of a vehicle ahead, needs to be derived. Such simplifying boundary conditions likewise apply to a plurality of problems in the fields of robotics and monitoring technology.
Methods of stereo image object detection can be divided into area-based methods and feature-based methods. Area-based methods are described, for example, in the conference papers by K. Sanejoschi, "3-D Image Recognition System by Means of Stereoscopy Combined with Ordinary Image Processing," Intelligent Vehicles '94, Oct. 24, 1994 to Oct. 26, 1994, Paris, pages 13 to 18 and L. Matthies et al., "Obstacle Detection for Unmanned Ground Vehicles: A Progress Report," Intelligent Vehicles '95, Sep. 25-26, 1995, Detroit, pages 66 to 71. They require a higher computing capacity than feature-based methods. For an overview of current stereo image object detection methods, see O. Faugeras, "Three-Dimensional Computer Vision," MIT Press, 1993.
A method for detection of objects, especially vehicles, is known from German patent document DE 44 31 479 A1, in which two images are taken of a given area from different viewing angles. From a comparison of the two images, especially their gray values, an object is detected for at least a partial area if the difference between the two images for the partial area in question is greater than a predetermined threshold.
In a system disclosed in German patent document DE 43 08 776 A1 for monitoring a state external to the vehicle, a stereo image object detection method is used by which a given object is imaged within a fixed region outside a vehicle. The recorded images are passed to an image processing device which calculates the distance distribution over the entire image. In order to discover a given object in the left and right images, each image is divided into small regions, and color or brightness samples within these regions are compared between the two images in order to find regions with corresponding object details and, from these, to determine the distance distribution over the entire stereo image.
European patent document EP 0 626 655 A2 describes a device for detecting vehicles ahead and for determining their distance, which uses a stereo image object detection method. For image evaluation, special techniques are employed, tailored to detection of vehicle contours, which use vehicle contour models. The corresponding disparity and hence the vehicle distance are determined for a recognized vehicle.
One object of the present invention is to provide a method for stereo image object detection that is especially suitable for the above applications, with simplified boundary conditions, and is comparatively efficient and reliable.
In the method according to the invention, a structure class image pair (that is, a pair of images, with pixel data processed as discussed below) is initially generated from the recorded stereo image pair, with the differences in brightness of predetermined pixels in the environment being determined for each pixel, in digital form. The digital values are then combined in a predetermined sequence to form a digital value group, each of the various possible groups thus defining a separate structure class (that is, a different pattern of brightness variation).
Next, an advantageous and simply-designed correspondence analysis is conducted, in which all structure classes are omitted from consideration that show no structure gradients in the direction of the epipolar line; that is, along the line of corresponding pixel pairs of a common original pixel. (These are the structure classes whose pixels do not differ in brightness by a predeterminable amount from the brightness of the ambient pixels located in the direction of the epipolar line.) This results in considerable savings in image processing, since structures that extend in this direction, by virtue of the system, are not useful in any case for determining distance in feature-based stereo image evaluation.
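The omission of structure classes without a gradient along the epipolar line can be sketched as follows. This is only an illustration under stated assumptions: the representation of a structure class as a tuple of four digits (left, right, up, down), the meaning of 0 as "no significant brightness difference", and a horizontal epipolar line are all hypothetical choices not fixed by the text above.

```python
# Assumed encoding (not taken from the text): a structure class is a tuple of
# four digits (left, right, up, down); 0 means "no significant brightness
# difference" to that neighbour. The epipolar line is assumed to be horizontal.

def has_epipolar_gradient(structure_class):
    """Return True if the class shows a brightness step along the epipolar line."""
    left, right, _up, _down = structure_class
    return left != 0 or right != 0


def usable_classes(classes):
    """Keep only the classes that can contribute to the correspondence analysis."""
    return [c for c in classes if has_epipolar_gradient(c)]


candidates = [(0, 0, 1, 2), (1, 0, 0, 0), (0, 2, 1, 1), (0, 0, 0, 0)]
kept = usable_classes(candidates)
```

In this example the first and last candidates are dropped, because their brightness varies only perpendicular to the assumed epipolar direction or not at all.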
For all the other structure classes, disparity values of corresponding pixels in the same structure class are then determined and accumulated in a disparity histogram, so that each disparity value receives a frequency value. Optionally, for each corresponding pixel pair, the corresponding disparity value can be included in the histogram with a weighted frequency increment. Then the histogram is examined for accumulation regions, that is, ranges of disparity values with conspicuously high frequency. For each accumulation region of interest, the associated pixel group is then mapped back into the corresponding structure class image and interpreted as an object that is located at a certain distance. It turns out that this object detection method operates very efficiently for many applications, and offers reliable results which are less prone to error.
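The histogram step and the mapping of a histogram peak back to its pixels might look roughly like the sketch below. It assumes rectified images in which a pixel at column x in the left image appears at column x - d (d >= 0) in the right image, structure classes stored as integers, and class 0 standing for the omitted classes; the function name `collect_disparities` and these conventions are illustrative only.

```python
from collections import defaultdict


def collect_disparities(left, right, max_disparity, skip=frozenset()):
    """Map each disparity to the left-image pixels (row, column) that voted for it.

    `left` and `right` are structure class images given as lists of rows.
    Assumed convention: a left pixel at column x matches right column x - d.
    """
    votes = defaultdict(list)
    for y, (left_row, right_row) in enumerate(zip(left, right)):
        for x_left, cls in enumerate(left_row):
            if cls in skip:
                continue  # class without usable structure along the epipolar line
            for d in range(min(max_disparity, x_left) + 1):
                if right_row[x_left - d] == cls:
                    votes[d].append((y, x_left))
    return votes


left = [[0, 0, 0, 5, 7, 0, 0, 0],
        [0, 0, 0, 5, 7, 0, 0, 0]]
right = [[0, 5, 7, 0, 0, 0, 0, 0],
         [0, 5, 7, 0, 0, 0, 0, 0]]

votes = collect_disparities(left, right, max_disparity=4, skip={0})
histogram = {d: len(pixels) for d, pixels in votes.items()}
peak = max(histogram, key=histogram.get)   # disparity with the highest frequency
object_pixels = votes[peak]                # pixel group interpreted as one object
```

Keeping the contributing pixel positions next to each frequency value is one simple way to make the back-mapping from a peak to its pixel group a direct lookup.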
In one embodiment of the method according to the invention, a ternary logic is used to digitize the brightness differences. This arrangement permits a structural classification that is very advantageous for the applications under consideration here, for two reasons: first it permits sufficiently differentiated structure classifications; and second it offers structure classifications that can be performed rapidly.
In another embodiment, the ambient pixels selected are the four pixels that directly adjoin the respective reference pixel on both sides, first parallel to and then perpendicular to the epipolar line, or that are separated from it by a predeterminable sampling width. A sampling width of one or possibly several pixels allows the incorporation of a larger neighborhood area, which is advantageous in cases in which the brightness gradients typically extend over a range of several pixels.
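A minimal sketch of how a structure class could be derived from four ambient pixels with ternary digitization is given below. The digit meanings, the threshold value, the digit order (left, right, up, down), the base-3 packing into a single integer, and a horizontal epipolar line are all assumptions made for illustration. With four ternary digits there are 3^4 = 81 possible classes.

```python
def ternary_digit(diff, threshold):
    """0: roughly equal, 1: neighbour clearly brighter, 2: neighbour clearly darker."""
    if diff > threshold:
        return 1
    if diff < -threshold:
        return 2
    return 0


def structure_class(image, y, x, step=1, threshold=20):
    """Pack the four ternary digits of pixel (y, x) into an integer in 0..80.

    `step` is the sampling width in pixels. The caller must keep (y, x) at
    least `step` pixels away from the image border.
    """
    centre = image[y][x]
    neighbours = (
        image[y][x - step],  # left, along the assumed epipolar line
        image[y][x + step],  # right, along the assumed epipolar line
        image[y - step][x],  # above, perpendicular to it
        image[y + step][x],  # below, perpendicular to it
    )
    code = 0
    for value in neighbours:
        code = code * 3 + ternary_digit(value - centre, threshold)
    return code


edge = [[100, 100, 100],
        [100, 100, 200],
        [100, 100, 100]]
flat = [[100] * 3 for _ in range(3)]

edge_class = structure_class(edge, 1, 1)
flat_class = structure_class(flat, 1, 1)
```

Here the only significant difference is to the right-hand neighbour, so the digits are (0, 1, 0, 0) and the packed value is 9; the uniform patch yields class 0.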
In a further embodiment, the necessary calculation is further decreased by the fact that small disparities are favored. For each pixel of one structure class image, only the minimal disparity (in other words, the distance to the closest pixel with the same structure class in the other structure class image) is determined and taken into account to plot the disparity histogram. The basic assumption of favoring small disparities and hence larger object distances is especially suitable for use in vehicles for viewing street scenes with their relatively large obstacles. With this procedure, the appearance of close phantom objects in correspondence analysis, like those that can be generated by remote periodic structures such as building fronts, is also suppressed.
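The search for the minimal disparity can be sketched as a scan that stops at the first match. The sketch assumes rectified rows of integer structure classes and the convention that a left pixel at column x appears at column x - d (d >= 0) in the right image; the function name and the search direction are illustrative assumptions.

```python
def minimal_disparity(left_row, right_row, x_left, max_disparity):
    """Return the smallest d >= 0 with the same class at right column x_left - d.

    Returns None if no pixel of the same class lies within the search range.
    """
    cls = left_row[x_left]
    for d in range(min(max_disparity, x_left) + 1):
        if right_row[x_left - d] == cls:
            return d  # stop at the closest match; larger disparities are ignored
    return None


# A periodic structure: class 4 repeats every two pixels in the right row.
left_row = [0, 0, 0, 0, 0, 0, 4, 0, 0]
right_row = [0, 4, 0, 4, 0, 4, 0, 0, 0]

d = minimal_disparity(left_row, right_row, x_left=6, max_disparity=6)
```

Although disparities 1, 3 and 5 would all match, only the smallest one is returned, so the repeated pattern does not produce votes for large disparities that would suggest a nearby phantom object.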
In still another embodiment, in addition to the digital value group representing the brightness differences, each pixel is assigned a contrast value that differentiates more sharply the magnitudes of the brightness differences that occur, and is used as a weighting factor to form a weighted frequency increment. For each corresponding pixel pair for which a disparity value is determined, the same is then added with the contrast-dependent weighted frequency increment to the disparity histogram. As a result, structures with high contrast are taken into account to a greater extent in image evaluation, reducing background noise.
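The contrast-dependent increment could be sketched as follows. The text does not specify how the contrast value is computed or how the contrast values of the two pixels in a pair are combined; the mean absolute brightness difference and the use of the smaller of the two values are hypothetical choices made only for this illustration.

```python
from collections import defaultdict


def contrast(centre, neighbours):
    """Mean absolute brightness difference to the ambient pixels (assumed measure)."""
    return sum(abs(n - centre) for n in neighbours) / len(neighbours)


def add_weighted(histogram, disparity, contrast_left, contrast_right):
    """Add a contrast-weighted increment for one corresponding pixel pair.

    Taking the smaller contrast of the pair is an assumption, not a rule
    stated in the text.
    """
    histogram[disparity] += min(contrast_left, contrast_right)


histogram = defaultdict(float)

strong = contrast(100, [100, 180, 100, 100])  # pronounced edge
weak = contrast(100, [100, 108, 100, 100])    # faint edge, close to noise

add_weighted(histogram, 2, strong, strong)
add_weighted(histogram, 5, weak, weak)
```

The pair with the pronounced edge contributes ten times as much to the histogram as the faint one, which is the intended effect of suppressing background noise.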
In yet another embodiment, object detection is expanded by cluster analysis of the pixel groups interpreted as belonging to an individual object. Such grouping is sufficiently simply structured, and also permits an improved location and tracking of movement of the object.
In still a further embodiment, continuous movement tracking is performed on an object after it has been detected, with object detection according to the invention being performed cyclically and repeatedly and limited to a minimum cube that contains the detected object. The limitation to the relevant image area in each case not only saves computing effort, but in particular also reduces the influence of disturbing background objects. In addition, only a relatively small range of disparity values needs to be evaluated, with the corresponding depth of the cube being determined from the maximum anticipated change in distance.
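How the depth of the cube translates into a range of disparity values can be illustrated with the standard relation for a rectified stereo pair, disparity = focal length x baseline / distance. The relation itself is standard stereo geometry rather than something stated in the text, and the parameter names and numerical values below are hypothetical.

```python
import math


def disparity_window(distance_m, max_change_m, focal_px, baseline_m):
    """Return (d_min, d_max) in pixels covering distance_m +/- max_change_m.

    Uses d = focal_px * baseline_m / distance for a rectified stereo pair.
    """
    near = max(distance_m - max_change_m, 1e-6)  # guard against zero or negative range
    far = distance_m + max_change_m
    d_min = math.floor(focal_px * baseline_m / far)
    d_max = math.ceil(focal_px * baseline_m / near)
    return d_min, d_max


# Object last seen at 20 m; at most 5 m change expected until the next cycle.
window = disparity_window(20.0, 5.0, focal_px=800.0, baseline_m=0.25)
```

With these assumed values only disparities from 8 to 14 pixels need to be evaluated in the next cycle, instead of the full disparity range.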
Finally, still another embodiment is suitable for highway vehicles and includes a plausible road model that makes it possible to filter out structures on the surface of the road with a high degree of reliability. This arrangement facilitates the detection of objects in the area above the surface of the road and/or reliable detection of markings on the roadway for autonomous vehicle transverse guidance.
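One conceivable road model, given here purely as an assumption since the text does not specify it, is a flat road viewed by cameras mounted at a known height with the optical axis parallel to the road. For an image row lying `row - horizon_row` pixels below the horizon, a point on the road then has the expected disparity baseline x (row - horizon_row) / camera height. Pixels whose measured disparity matches this value can be treated as road-surface structures. All names, parameters and the tolerance are hypothetical.

```python
def road_disparity(row, horizon_row, baseline_m, camera_height_m):
    """Expected disparity (pixels) of a flat road surface at the given image row."""
    if row <= horizon_row:
        return 0.0  # at or above the horizon no road point is visible
    return baseline_m * (row - horizon_row) / camera_height_m


def is_road_structure(row, disparity, horizon_row, baseline_m,
                      camera_height_m, tolerance=1.0):
    """True if the measured disparity is consistent with the flat-road model."""
    if row <= horizon_row:
        return False
    expected = road_disparity(row, horizon_row, baseline_m, camera_height_m)
    return abs(disparity - expected) <= tolerance


params = dict(horizon_row=100, baseline_m=0.25, camera_height_m=1.25)

expected = road_disparity(200, **params)
marking = is_road_structure(200, 20.0, **params)   # lies on the road surface
obstacle = is_road_structure(200, 30.0, **params)  # closer than the road at this row
```

Pixels flagged by such a test could either be discarded when looking for elevated objects or kept separately when roadway markings are of interest.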
Other objects, advantages and novel features of the present invention will become apparent from the following detailed description of the invention when considered in conjunction with the accompanying drawings.