The present invention relates to automated vision systems, and more particularly to a system for three-dimensional object segmentation.
Passive techniques of stereopsis involve triangulation of features viewed from different positions or at different times, under ambient lighting conditions, as described in "Structure From Stereo - A Review," Dhond, Umesh R., and Aggarwal, J. K., IEEE Transactions On Systems, Man, And Cybernetics, Vol. 19, No. 6, November/December 1989. The major steps in stereopsis are preprocessing, matching, and recovering depth information. As described in the reference, the process of matching features between multiple images is perhaps the most critical stage of stereopsis. This step is also called the correspondence problem.
It is also well known that stereo matching using edge segments, rather than individual points, provides increased immunity from the effects of isolated points, and provides an additional disambiguating constraint in matching segments of different stereoscopic images taken of the same scene. A variety of algorithms can be used for matching edge segments that meet criteria for 3-D segments occurring along a smooth surface. In addition, a trinocular camera arrangement provides further information that can improve a binocular depth map with points (or edges) matched if they satisfy additional geometric constraints, such as length and orientation.
Once the segmented points have been identified and the depth information recovered, the 3-D object structure can be obtained, which can then be used in 3-D object recognition. The purpose of this embodiment is to segment the 3-D scene into 3-D objects that are spatially separated in a 2-D plane, rather than to perform object recognition. Therefore, an elaborate 3-D object reconstruction is not necessary.
However, the prior combinations of feature detection, matching, and 3-D segmentation are computationally intensive, either decreasing the speed or increasing the cost of automated systems. Furthermore, prior methods lack robustness because of susceptibility to noise and confusion among match candidates. 3-D data is mostly used for object recognition, as opposed to segmentation of objects placed in a plane in 3-D space. Known techniques, typically using 2-D segmentation, assume a fixed relationship between the camera system and the plane under consideration; that is, they do not facilitate specifying an arbitrary plane.
The present invention provides a three-dimensional (3-D) machine-vision object-segmentation solution involving a method and apparatus for performing high-integrity, high-efficiency machine vision. The machine vision segmentation solution converts stereo sets of two-dimensional video pixel data into 3-D point data that is then segmented into discrete objects, allowing subsequent characterization of a specific 3-D object, objects, or an area within view of a stereoscopic camera. Once the segmented points have been identified and the depth information recovered, the 3-D object structure can be obtained, which can then be used in 3-D object recognition.
According to the invention, the 3-D machine-vision segmentation solution includes an image acquisition device such as two or more video cameras, or digital cameras, arranged to view a target scene stereoscopically. The cameras pass the resulting multiple video output signals to a computer for further processing. The multiple video output signals are connected to the input of a video processor adapted to accept the video signals, such as a "frame grabber" sub-system. Video images from each camera are then synchronously sampled, captured, and stored in a memory associated with a data processor (e.g., a general purpose processor). The digitized image in the form of pixel information can then be accessed, archived, manipulated and otherwise processed in accordance with capabilities of the vision system. The digitized images are accessed from the memory and processed according to the invention, under control of a computer program. The results of the processing are then stored in the memory, or may be used to activate other processes and apparatus adapted for the purpose of taking further action, depending upon the application of the invention.
In further accord with the invention, the 3-D machine-vision segmentation solution method and apparatus includes a process and structure for converting a plurality of two-dimensional images into clusters of three-dimensional points and edges associated with boundaries of objects in the target scene. A set of two-dimensional images is captured, filtered, and processed for edge detection. The filtering and edge detection are performed separately for the image corresponding to each separate camera, resulting in a plurality of sets of features and chains of edges (edgelets), characterized by location, size, and angle. The plurality is then sub-divided into stereoscopic pairs for further processing, i.e., Right/Left, and Top/Right.
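As an illustration only, the per-camera edge detection step can be sketched as follows. The sketch assumes a Sobel gradient operator and a fixed strength threshold, neither of which is specified above, and it omits the filtering and the chaining of edgelets. The function name detect_edgelets and its parameters are hypothetical.

```python
import math

# Standard 3x3 Sobel kernels (an assumed choice of edge operator).
SOBEL_X = ((-1, 0, 1), (-2, 0, 2), (-1, 0, 1))
SOBEL_Y = ((-1, -2, -1), (0, 0, 0), (1, 2, 1))


def detect_edgelets(image, min_strength=100.0):
    """Return (x, y, strength, angle) tuples for pixels with a strong gradient.

    image is a list of rows of grey levels. angle is the direction of the
    edge itself (perpendicular to the gradient), in degrees within [0, 180),
    measured from the image x axis.
    """
    height, width = len(image), len(image[0])
    edgelets = []
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = gy = 0
            for j in range(3):
                for i in range(3):
                    pixel = image[y + j - 1][x + i - 1]
                    gx += SOBEL_X[j][i] * pixel
                    gy += SOBEL_Y[j][i] * pixel
            strength = math.hypot(gx, gy)
            if strength >= min_strength:
                angle = (math.degrees(math.atan2(gy, gx)) + 90.0) % 180.0
                edgelets.append((x, y, strength, angle))
    return edgelets


# A 5x5 image with a vertical step edge between columns 1 and 2.
step_image = [[0, 0, 255, 255, 255] for _ in range(5)]
edgelets = detect_edgelets(step_image)
```

In a three-camera arrangement, the same routine would be run once per camera image, and the resulting feature sets would then be grouped into the Right/Left and Top/Right pairs.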
The stereoscopic sets of features and chains are then pair-wise processed according to the stereo correspondence problem, matching features from the right image to the left image, resulting in a set of horizontal disparities, and matching features from the right image to the top image, resulting in a set of vertical disparities. The robust matching process involves measuring the strength and orientation of edgelets, tempered by a smoothness constraint, and followed by an iterative uniqueness process.
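The right-to-left matching step can be illustrated with the following minimal sketch. It assumes rectified images, so that candidate matches lie on nearly the same row, and it substitutes a simple greedy one-to-one assignment for the iterative uniqueness process; the smoothness constraint is omitted. The Edgelet type, the scoring formula, and all tolerances are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edgelet:
    x: float
    y: float
    strength: float  # gradient magnitude
    angle: float     # orientation in degrees


def angle_diff(a, b):
    """Smallest absolute difference between two angles, in degrees."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


def match_score(r, l, max_angle=30.0):
    """Score in [0, 1] from strength and orientation similarity, or None."""
    da = angle_diff(r.angle, l.angle)
    strongest = max(r.strength, l.strength)
    if da > max_angle or strongest <= 0.0:
        return None
    strength_similarity = min(r.strength, l.strength) / strongest
    return strength_similarity * (1.0 - da / max_angle)


def match_horizontal(right, left, row_tol=1.0, max_disp=64.0):
    """Map each right-image index to (left-image index, horizontal disparity)."""
    candidates = []
    for i, r in enumerate(right):
        for j, l in enumerate(left):
            disparity = l.x - r.x
            if abs(r.y - l.y) > row_tol or not 0.0 <= disparity <= max_disp:
                continue
            score = match_score(r, l)
            if score is not None:
                candidates.append((score, i, j, disparity))
    # Greedy uniqueness: accept the best scores first, one match per edgelet.
    candidates.sort(reverse=True)
    used_right, used_left, matches = set(), set(), {}
    for score, i, j, disparity in candidates:
        if i not in used_right and j not in used_left:
            used_right.add(i)
            used_left.add(j)
            matches[i] = (j, disparity)
    return matches


right = [Edgelet(10, 5, 100, 90), Edgelet(20, 5, 50, 45)]
left = [Edgelet(14, 5, 98, 92), Edgelet(25, 5, 52, 44)]
matches = match_horizontal(right, left)
```

The right-to-top matching would follow the same pattern with the roles of x and y exchanged, yielding vertical disparities.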
Further according to the invention, the multiple (i.e., horizontal and vertical) sets of results are then merged (i.e., multiplexed) into a single consolidated output, according to the orientation of each identified feature and a pre-selected threshold value. Processing of the consolidated output then proceeds using factors such as the known camera geometry to determine a single set of 3-D points. The set of 3-D points is then further processed into a set of 3-D objects through a "clustering" algorithm which segments the data into distinct 3-D objects. The output can be quantified as either a 3-D location of the boundary points of each object within view, or segmented into distinct 3-D objects in the scene where each object contains a mutually exclusive subset of the 3-D boundary points output by the stereo algorithm.
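One possible reading of the merging and 3-D conversion steps is sketched below. It assumes that steep edges are better localized by the horizontal camera pair and shallow edges by the vertical pair, that the threshold is 45 degrees, and that depth follows the standard rectified-stereo relation Z = f * B / d. The dictionary keys, function names, and camera parameters are hypothetical and are not taken from the description above.

```python
def tilt_from_horizontal(angle_deg):
    """Fold an edge direction into the range 0..90 degrees from the x axis."""
    a = angle_deg % 180.0
    return a if a <= 90.0 else 180.0 - a


def merge_disparities(features, threshold_deg=45.0):
    """Pick one disparity per feature according to its edge orientation.

    Each feature is a dict with keys 'x', 'y', 'angle', 'h_disp', 'v_disp'.
    Returns (x, y, source, disparity) tuples, source being 'h' or 'v'.
    Features whose selected disparity is missing are dropped.
    """
    merged = []
    for f in features:
        use_horizontal = tilt_from_horizontal(f["angle"]) >= threshold_deg
        source = "h" if use_horizontal else "v"
        disparity = f["h_disp"] if use_horizontal else f["v_disp"]
        if disparity is not None:
            merged.append((f["x"], f["y"], source, disparity))
    return merged


def to_3d(x, y, disparity, baseline, focal_px, cx, cy):
    """Triangulate one point for a rectified pair; returns None if d <= 0."""
    if disparity <= 0:
        return None
    z = focal_px * baseline / disparity
    return ((x - cx) * z / focal_px, (y - cy) * z / focal_px, z)


features = [
    {"x": 320, "y": 240, "angle": 90.0, "h_disp": 8.0, "v_disp": 2.0},
    {"x": 400, "y": 240, "angle": 10.0, "h_disp": 3.0, "v_disp": 4.0},
]
merged = merge_disparities(features)
# Assumed camera geometry: 0.1 m baseline, 800-pixel focal length.
point = to_3d(320, 240, merged[0][3], baseline=0.1, focal_px=800.0,
              cx=320.0, cy=240.0)
```

If the two camera pairs have different baselines, the baseline passed to to_3d would be chosen according to the source tag carried with each merged feature.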
Machine vision systems effecting processing according to the invention can provide, among other things, an automated capability for performing diverse inspection, location, measurement, alignment and scanning tasks. The present invention provides segmentation of objects placed in a plane in 3-D space. The criterion for segmentation into distinct objects is that the minimum distance between the objects along that plane (2-D distance) exceeds a preset spacing threshold. The potential applications involve segmenting images of vehicles on a road, machinery placed on a factory floor, or objects placed on a table. Features of the present invention include the ability to generate a wide variety of real-time 3-D information about 3-D objects in the viewed area. Using the system according to the invention, the distance from one object to another can be calculated, and the distance of the objects from the camera can also be computed.
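The segmentation criterion can be illustrated with the following sketch, which projects 3-D points onto a user-specified plane and groups points whose in-plane distance does not exceed the spacing threshold. It uses a simple pairwise union-find grouping as a stand-in and is not the chain-based clustering method of the invention. The function names and the representation of the plane by an origin point and a normal vector are assumptions.

```python
import math


def _sub(a, b):
    return tuple(p - q for p, q in zip(a, b))


def _dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def _unit(a):
    norm = math.sqrt(_dot(a, a))
    return tuple(p / norm for p in a)


def project_to_plane(points, origin, normal):
    """Return 2-D coordinates of each 3-D point within the given plane."""
    n = _unit(normal)
    helper = (1.0, 0.0, 0.0) if abs(n[0]) < 0.9 else (0.0, 1.0, 0.0)
    u = _unit(_cross(n, helper))  # first in-plane axis
    v = _cross(n, u)              # second in-plane axis
    return [(_dot(_sub(p, origin), u), _dot(_sub(p, origin), v))
            for p in points]


def segment(points, origin, normal, spacing):
    """Group point indices into objects separated by more than spacing."""
    flat = project_to_plane(points, origin, normal)
    parent = list(range(len(flat)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(len(flat)):
        for j in range(i + 1, len(flat)):
            if math.dist(flat[i], flat[j]) <= spacing:
                parent[find(i)] = find(j)
    clusters = {}
    for i in range(len(flat)):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())


# Two objects standing on the floor plane z = 0; heights are ignored.
points = [(0.0, 0.0, 0.5), (0.1, 0.0, 1.0), (5.0, 5.0, 0.2), (5.1, 5.0, 2.0)]
objects = segment(points, origin=(0.0, 0.0, 0.0), normal=(0.0, 0.0, 1.0),
                  spacing=0.5)
```

Because the plane is passed in as a parameter, the same routine applies to a road surface, a factory floor, or a table top without assuming a fixed relationship between the cameras and the plane.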
According to the present invention a high accuracy feature detector is implemented, using chain-based correspondence matching. The invention adopts a 3-camera approach and a novel method for merging disparities based on angle differences detected by the multiple cameras. Furthermore, a fast chain-based clustering method is used for segmentation of 3-D objects from 3-D point data on any arbitrary plane. The clustering method is also more robust (less susceptible to false images) because object shadows are ignored.