This invention relates generally to non-contact methods for obtaining three-dimensional or range measurements of objects. More particularly, it relates to a structured light (or other radiation) range scanner using projected patterns that permit real-time range measurements of moving scenes with minimal surface continuity and reflectivity assumptions.
The ability to determine the distance to objects or surfaces in a three-dimensional spatial scene is becoming increasingly important in many fields, such as computer graphics, virtual and augmented reality, robot navigation, manufacturing, object shape recognition, and medical diagnostics. A large variety of non-contact techniques have been developed to obtain range images, two-dimensional arrays of numbers that represent the distance or depth from the imaging instrument to the imaged object. A range image measures the location of each point on an object's surface in three-dimensional space. Active range scanning methods include time-of-flight, depth from defocus, and projected-light triangulation. Triangulation-based methods, such as structured light range scanning, are applicable to a wider range of scene scales and have lower hardware costs than other active methods.
Devices that obtain range images by triangulation methods are known as range scanners. A typical prior art structured light range scanning system 10 is illustrated in FIG. 1. A light projector 12 projects a plane of light 14 onto a three-dimensional object 16 to be imaged, creating a narrow stripe. The image of each illuminated point of object 16 is detected by a camera 18 at a particular two-dimensional location on the camera image plane 20. The intersection of the known illumination plane 14 with a camera line of sight uniquely determines a point on the object surface. Provided that the relative geometry of the camera and projector is accurately known, the three-dimensional locations of surface points of object 16 can be determined through triangulation. Obtaining a range image of the entire scene requires sweeping the light plane 14 across the scene over time, a relatively slow process that is currently not feasible for moving scenes or real-time data acquisition.
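The triangulation step described above can be illustrated with a minimal sketch in Python. It assumes a coordinate frame with the camera center at the origin, a line of sight given as a direction vector, and a light plane written as n · X = d; the function name and this parameterization are illustrative assumptions, not taken from the system of FIG. 1.

```python
def triangulate(pixel_ray, plane_normal, plane_offset):
    """Intersect a camera line of sight with a known light plane.

    Assumptions (hypothetical, for illustration only):
      - the camera center is the origin of the coordinate frame;
      - pixel_ray is the 3-D direction of the line of sight through a pixel;
      - the light plane is the set of points X with plane_normal . X = plane_offset.
    Returns the 3-D surface point as a tuple (x, y, z).
    """
    denom = sum(n * r for n, r in zip(plane_normal, pixel_ray))
    if abs(denom) < 1e-12:
        raise ValueError("line of sight is parallel to the light plane")
    # Points on the line of sight are t * pixel_ray; solve n . (t * r) = d for t.
    t = plane_offset / denom
    return tuple(t * r for r in pixel_ray)


# A light plane at x = 1 viewed along the direction (0.5, 0, 1).
point = triangulate((0.5, 0.0, 1.0), (1.0, 0.0, 0.0), 1.0)
```

In this example the line of sight meets the plane at depth z = 2. The plane parameters would in practice come from calibration of the relative camera and projector geometry.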
In order to speed up the imaging process and eliminate mechanical scanning, two-dimensional projected light patterns have been used. Rather than a single plane, a number of distinct planes are projected onto the scene; the image obtained therefore contains range information for the entire scene, and not just a small slice of it. Such illumination patterns introduce a new problem: identifying correspondences between image positions and pattern positions. When only a single light plane is projected, an illuminated part of the image must correspond to the projected plane. If a number of planes are projected, however, a camera pixel receiving light originating from the projector may correspond to any one of the projected planes, and it is necessary to determine the responsible plane.
Highly reliable identification of light planes with minimal assumptions about the nature of the scene can be achieved by time multiplexing, i.e., by sequentially projecting several different illumination patterns. The sequence of intensity values received at each camera pixel defines a unique code that identifies a location of the projection pattern, therefore allowing triangulation to be performed. Range data can be computed after the full sequence of projection patterns has been captured by the camera. A large number of pattern systems have been developed for a variety of scene constraints, each one having different advantages applicable to different types of scenes. Two constraints applicable to static scenes are the surface continuity and reflectivity of the scene. A surface reflectivity assumption refers to the similarity of reflectivity of adjacent surface regions in the scene. If the scene is of uniform color, then it is much easier to obtain information from the reflected intensities. Spatially varying reflectivities require different decision thresholds for different pixels of the detector. For example, consider a pattern that contains three different projected light intensities. The same projected intensity results in different reflected and imaged intensities when reflected from different scene locations, and a decision about the detected intensity level requires at least some knowledge of the reflectivity of the corresponding scene surface. Similar considerations apply to the surface continuity of a scene. Scenes with high surface continuity, i.e., smoothly changing surfaces, such as human forms, allow correlation of pattern features across large distances. Low-surface-continuity scenes, however, require that codes be ascertained from nearby pixels, without requiring information from far-away detector pixels.
One well-known system of time-modulated illumination patterns uses binary Gray codes, projecting a series of stripes that decrease in width in sequential patterns. An early use of Gray codes is described in K. Sato and S. Inokuchi, "Three-Dimensional Surface Measurement by Space Encoding Range Imaging," J. Robotic Systems, 2, 27-39, 1985, and a large number of variations are available in the art. A sequence of Gray coded patterns, each projected at a particular time t_i, is shown in FIG. 2. Each pattern can be thought of as a bit plane for a Gray code, considering a shaded stripe as a 0 bit and a white stripe as a 1 bit. For example, the scene surface that receives light at the location marked with the center of an X sees the bit code 1 1 1 0 1. As long as the scene does not move, a particular camera pixel (or multiple pixels) receives light reflected from this particular scene surface over the duration of the projection sequence, and the detected 1 1 1 0 1 code is used to determine the projector location corresponding to this camera pixel. The number of patterns needed is determined by the desired resolution, which may itself be determined by the camera or projector. For a maximum of N distinguishable horizontal positions, a Gray coded pattern requires log₂ N patterns, rounded up to the next integer when N is not a power of two.
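As a hedged illustration of the bit-plane idea, the following Python sketch builds pattern bit planes using the standard binary-reflected Gray code and decodes the bit sequence seen at one position. The function names are hypothetical, and the binary-reflected ordering is an assumption; the ordering used in FIG. 2 may differ.

```python
def gray_patterns(num_positions):
    """Return bit planes (most significant first) for a binary-reflected Gray code.

    patterns[k][x] is the bit (0 = shaded, 1 = white) projected at horizontal
    position x in the k-th pattern. The number of patterns is log2 of
    num_positions, rounded up.
    """
    num_bits = max(1, (num_positions - 1).bit_length())
    patterns = []
    for k in range(num_bits):
        shift = num_bits - 1 - k
        # x ^ (x >> 1) is the binary-reflected Gray code of x.
        patterns.append([((x ^ (x >> 1)) >> shift) & 1 for x in range(num_positions)])
    return patterns


def decode(bits):
    """Convert a Gray-coded bit sequence (most significant first) to a position."""
    value = 0
    for b in bits:
        # Each binary bit is the previous binary bit XOR the current Gray bit.
        value = (value << 1) | ((value & 1) ^ b)
    return value


patterns = gray_patterns(32)          # 5 patterns for 32 positions
code_at_22 = [p[22] for p in patterns]
position = decode(code_at_22)
```

Under this assumed ordering, adjacent positions differ in exactly one bit, so a decision error at a stripe edge shifts the decoded position by at most one. The bit sequence 1 1 1 0 1 decodes to position 22 in this particular ordering.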
Structured light patterns based on Gray codes are very robust and widely used. They require virtually no assumptions about the scene reflectivity or surface continuity. Correspondences are determined using only single-pixel information, and pixel thresholds can be determined initially by projecting all-dark and all-bright patterns. However, a relatively large number of patterns is needed, demanding that the scene remain static for the duration of the pattern sequence. Thus if real-time range data is needed, or moving scenes are imaged, a different pattern system is required.
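The per-pixel thresholding mentioned above can be sketched as follows. This is a minimal Python illustration that assumes the threshold for each pixel is the midpoint between its all-dark and all-bright readings; the midpoint rule and the function names are assumptions for illustration, since the text does not specify how the thresholds are derived.

```python
def pixel_thresholds(dark, bright):
    """Per-pixel thresholds from all-dark and all-bright calibration images.

    dark and bright are lists of rows of intensities. The midpoint rule used
    here is an assumed, simple choice.
    """
    return [[(d + b) / 2 for d, b in zip(dark_row, bright_row)]
            for dark_row, bright_row in zip(dark, bright)]


def classify(image, thresholds):
    """Turn a captured image into a bit image using per-pixel thresholds."""
    return [[1 if value > t else 0 for value, t in zip(image_row, t_row)]
            for image_row, t_row in zip(image, thresholds)]


# Two pixels viewing surfaces of very different reflectivity.
dark = [[10, 40]]
bright = [[110, 60]]
thresholds = pixel_thresholds(dark, bright)
bits = classify([[55, 55]], thresholds)
```

In this example both pixels read the same intensity, 55, yet the first is classified as dark and the second as bright, because each decision depends only on that pixel's own calibration readings.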
One approach for imaging moving scenes is to use a "one-shot" system of projection patterns. A single pattern is continually projected, and range information is obtained for each camera image. Provided that the scene movement is not so rapid as to cause motion blur in the captured images, one-shot systems can be used to obtain range data for moving scenes. The drawback of one-shot systems is that they are limited to scenes having relatively constant reflectivity and a high amount of continuity. For example, color projector patterns have been used in which each projector pixel transmits light with a distinct ratio among the red, green, and blue (RGB) or hue, saturation, and intensity (HSI) color components. The detected RGB or HSI value at each camera pixel determines the correspondence between the camera pixel location and projector pixel location. In a different system, the pattern consists of a continual fade from black to white, with detected gray-scale intensity values used to determine the responsible projector pixels. Both of these patterns require a relatively constant and known reflectivity for all locations within the scene. An alternative one-shot system projects a repetitive pattern such as a grid. An example of such a method is described in M. Proesmans et al., "One-Shot Active 3D Shape Acquisition," Proceedings of the International Conference on Pattern Recognition, 336-340, 1996. Assuming a relatively smooth scene surface, distortions of the grid angles and lines on a global scale are used to determine correspondences. However, the projected grid must remain connected in the image of the illuminated scene in order to allow accurate correspondences to be determined.
There is still a need, therefore, for a triangulation-based structured light range scanner that can obtain real-time image data of moving scenes without assuming global surface continuity or uniform reflectivity of the scene.
Accordingly, it is a primary object of the present invention to provide a range scanning method that produces real-time range images of moving scenes.
It is a further object of the invention to provide a range scanning method that uses a minimal number of projection patterns while making only local assumptions about the continuity or reflectivity of a scene surface.
It is an additional object of the invention to provide a range scanning method for generating complete three-dimensional models of rigid objects.
It is another object of the present invention to provide a range scanning method that uses commercially-available, inexpensive hardware.
It is a further object of the invention to provide a range scanning method that can be adapted to a wide variety of scene constraints and computational requirements.
It is to be understood that not all embodiments of the present invention accomplish all of the above objects.
These objects and advantages are attained by a method for real-time range scanning of a scene, possibly moving, containing the following steps: projecting a sequence of N radiation patterns onto a scene; capturing a current image of the scene with an image detector; identifying features of the current image with matching features in a distinct image, for example, the previous image; determining fixed projection positions within the radiation pattern that correspond to current image features; and computing a range for each current image feature by geometrical analysis, such as triangulation, based on the corresponding fixed projection position. The method may also include a step of estimating a spatially varying surface reflectivity of the scene.
Each radiation pattern contains a set of pattern features at fixed projection positions. The sequence of N matching pattern features at each fixed projection position defines a code, which is preferably unique and permits identification of the image feature with its corresponding projection position. In the current image, the position of each image feature may be different from the position of its matching image feature in the distinct image, because at least part of the scene may have moved between the distinct and current images. Preferably, the distance between matching image features in sequential images is below a threshold distance. The code for each image feature is determined from the current image and differently-timed images, preferably images corresponding to the N−1 preceding or succeeding patterns, and is used in determining corresponding projection positions.
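The matching of image features between sequential images, and the accumulation of a code over N patterns, can be illustrated with a simplified Python sketch. It assumes features are one-dimensional positions along a single scanline and uses nearest-neighbor matching within a threshold distance; the function names, the nearest-neighbor rule, and the data layout are illustrative assumptions and not the matching procedure of the invention.

```python
def match_features(previous, current, max_distance):
    """For each current feature, return the index of its match in previous, or None.

    Assumed rule: the nearest previous feature, provided it lies within
    max_distance. One-to-one matching is not enforced in this sketch.
    """
    matches = []
    for pos in current:
        best = None
        for j, prev_pos in enumerate(previous):
            dist = abs(pos - prev_pos)
            if dist <= max_distance and (best is None or dist < abs(pos - previous[best])):
                best = j
        matches.append(best)
    return matches


def extend_codes(prev_codes, matches, current_values, n):
    """Append the current feature value to each matched history, keeping the last n values."""
    codes = []
    for match, value in zip(matches, current_values):
        history = prev_codes[match] if match is not None else []
        codes.append((history + [value])[-n:])
    return codes


previous_positions = [10.0, 20.0, 30.0]
current_positions = [11.5, 21.0, 45.0]
matches = match_features(previous_positions, current_positions, max_distance=3.0)
codes = extend_codes([[1, 0], [0, 1], [1, 1]], matches, [1, 1, 0], n=3)
```

Here the first two features have moved slightly and inherit the histories of their matches, giving complete three-value codes, while the third has no match within the threshold and starts a new history.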
The radiation used to generate the patterns may be electromagnetic radiation, such as visible light, infrared light, or x-ray radiation, in which case feature values may be light intensity values or RGB values. Features are preferably uniformly spaced in the projection pattern, with the spacing in part determined by a resolution of the image detector. The value of N is preferably the minimum number of patterns required to provide unique codes, given a specific number of features. The patterns may be selected in part in dependence on a computational efficiency of the various computation steps of the method. In one embodiment, the pattern features are stripe boundaries.
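The relationship between the number of features and the minimum number of patterns can be illustrated with a short Python sketch. It assumes that each feature can take one of a fixed number of distinguishable values per pattern (the hypothetical parameter num_values) and that all value sequences are usable as codes; a real pattern system may impose further constraints that raise the required N.

```python
def min_patterns(num_features, num_values):
    """Smallest N such that num_values ** N >= num_features.

    num_values is a hypothetical parameter: the number of distinguishable
    values a feature can take in a single pattern. This is a lower bound on N
    that ignores any extra constraints on which codes are usable.
    """
    if num_features < 1 or num_values < 2:
        raise ValueError("need at least one feature and two distinguishable values")
    n, capacity = 1, num_values
    while capacity < num_features:
        n += 1
        capacity *= num_values
    return n


n_binary = min_patterns(32, 2)     # two values per feature
n_four = min_patterns(110, 4)      # four values per feature
```

With two values per feature, 32 features need 5 patterns; with four values per feature, 110 features need 4 patterns, since three patterns provide only 64 codes.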
While the method is particularly advantageous for moving scenes, it can also be used for static scenes. In this case, the position of matching image features does not move between images. The pattern consists of a set of parallel stripes defining projected stripe boundaries that are identified by stripe values on each side of the boundary. A sequence of N stripe boundaries defines a code, preferably unique, for each fixed projection position. Preferably, the stripe boundaries are uniformly spaced, as determined by a resolution of the detector or projector.
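A small Python sketch may help illustrate how stripe boundaries acquire codes. It represents each pattern as a list of stripe values, labels each boundary in each pattern by the pair of stripe values on either side of it, and checks whether the resulting sequences are unique. The representation and function names are assumptions for illustration only.

```python
def boundary_codes(patterns):
    """Code for each stripe boundary: one (left, right) stripe-value pair per pattern.

    patterns is a list of N stripe patterns, each a list of stripe values with
    the same length. Boundary i lies between stripes i and i + 1.
    """
    num_boundaries = len(patterns[0]) - 1
    return [tuple((p[i], p[i + 1]) for p in patterns) for i in range(num_boundaries)]


def codes_are_unique(codes):
    """True if no two boundaries share the same code."""
    return len(set(codes)) == len(codes)


one_pattern = [[0, 1, 0, 1]]
two_patterns = [[0, 1, 0, 1], [0, 0, 1, 1]]
unique_with_one = codes_are_unique(boundary_codes(one_pattern))
unique_with_two = codes_are_unique(boundary_codes(two_patterns))
```

With a single pattern the first and third boundaries share the pair (0, 1) and cannot be told apart; adding the second pattern makes all three codes distinct. Note that a pair such as (0, 0) describes a boundary with the same stripe value on both sides in that pattern, which this sketch does not treat specially.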
The present invention also provides a system for implementing the above methods, containing a radiation projector for projecting the patterns, an image detector for capturing the images, and a data processor for performing computations to obtain a range image for each captured image.
Also provided is a program storage device accessible by a computer in communication with an image detector. The program storage device tangibly embodies a program of instructions executable by the computer to perform the method steps discussed above for real-time range scanning of moving scenes.