The present invention relates to a system and a method for three dimensional imaging and depth measurement of objects using active triangulation methods, and, more particularly, but not exclusively to three dimensional imaging of both objects at rest and in motion.
Three dimensional sensor systems are increasingly being used in a wide array of applications. These sensor systems determine the shape and/or features of an object positioned within the sensor system's field of view. In recent years, many methods have been proposed for implementing 3D modeling systems that are capable of fast and accurate acquisition of high resolution 3D images of objects for various applications.
The precise configuration of such 3D imaging systems may be varied. Most state of the art systems are not image based and have a synthetic, non life-like look to them, much the same as computer graphics based video games. Furthermore, many current triangulation-based systems use an array of two or more passive high resolution cameras to determine the depth values by what is known as passive stereo correspondence. Such a method, while indeed acquiring texture and possibly achieving high resolution, is both labor intensive and error prone. Passive stereo correspondence systems require manual assistance in defining correspondence between frames from each camera in order to calculate depth values. Automatic correspondence algorithms often produce an abundance of errors in matching between shots from different cameras, thus requiring human intervention for correspondence.
Other methods utilize LIDAR (Light Imaging Detection and Ranging) systems to determine range and/or other information of a distant target. By way of laser pulses, the distance to an object is determined by measuring the time delay between transmission of the laser pulse and detection of the reflected signal. Such methods, referred to as time-of-flight, are generally immune to occlusions typical of triangulation methods, but the accuracy and resolution are inherently inferior to that obtained in triangulation methods.
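By way of non-limiting illustration, the time-of-flight relation described above may be sketched as follows. The sketch assumes propagation at the vacuum speed of light and an ideal measurement of the round-trip delay; the function and constant names are hypothetical and are chosen only for this example.

```python
# Speed of light in vacuum, in metres per second (exact by SI definition).
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0


def time_of_flight_distance(round_trip_delay_s: float) -> float:
    """Return the one-way distance in metres for a measured round-trip delay.

    Assumption: the pulse travels in a straight line to the target and back
    at the vacuum speed of light, with no electronic or atmospheric delay.
    """
    if round_trip_delay_s < 0:
        raise ValueError("round-trip delay must be non-negative")
    # The pulse covers the distance twice (out and back), hence the halving.
    return SPEED_OF_LIGHT_M_PER_S * round_trip_delay_s / 2.0
```

Under these assumptions, a timing error of one nanosecond corresponds to a range error of roughly 15 centimetres, which is one way of seeing why the attainable depth resolution is tied to timing precision.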
Triangulation based 3D sensor systems and methods typically have one or more projectors as a light source for projecting onto a surface and one or more cameras at a defined, typically rectified, position relative to the projector for imaging the lighted surface. The camera and the projector therefore have different optical paths, and the distance between them is referred to as the baseline. Through knowledge of the baseline distance as well as the projection and imaging angles, known geometric/triangulation equations are utilized to determine the distance to the imaged object. The main differences among the various triangulation methods known in the art lie in the method of projection and the type of light projected, typically structured light, and in the process of image decoding to obtain three dimensional data.
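By way of non-limiting illustration, one common form of the triangulation relation referred to above may be sketched as follows. The sketch assumes a planar geometry in which the projector and the camera lie at the two ends of the baseline and both angles are measured between the baseline and the respective ray to the illuminated point; under that assumption the perpendicular distance from the baseline is z = b·sin(α)·sin(β) / sin(α + β). The function and parameter names are hypothetical.

```python
import math


def triangulate_depth(baseline: float,
                      projection_angle_rad: float,
                      imaging_angle_rad: float) -> float:
    """Return the perpendicular distance from the baseline to the lit point.

    Assumption: planar geometry, with each angle measured between the
    baseline and the ray from the projector (or camera) to the point.
    """
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    if projection_angle_rad <= 0 or imaging_angle_rad <= 0:
        raise ValueError("angles must be positive")
    angle_sum = projection_angle_rad + imaging_angle_rad
    if angle_sum >= math.pi:
        # The two rays are parallel or diverging and never intersect.
        raise ValueError("rays do not intersect in front of the baseline")
    # Law of sines applied to the projector-camera-point triangle.
    return (baseline
            * math.sin(projection_angle_rad)
            * math.sin(imaging_angle_rad)
            / math.sin(angle_sum))
```

For example, with a baseline of one unit and both angles equal to 45 degrees, the illuminated point lies half a unit from the baseline.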
Methods of light projection vary from temporal methods and phase shift methods, to spatial coded structured light and stereoscopic methods. Examples in the art of various forms of projected light include “laser fans” and line coded light.
Once a 2D image has been captured of the object onto which light is projected as described above, image processing software generally analyzes the image to extract the three dimensional geometry of the object and possibly the three dimensional movement of the object through space. This is generally done by comparison of features in the captured image with previously captured images and/or with known characteristics and features of the projected light. The implementation of this step varies widely among currently known methods, and is typically a function of the method used to project light onto the object. Whatever the method used, the outcome of the process is generally a type of disparity/displacement map of identified features in the captured image. The final step of 3D spatial location and/or 3D motion capture involves the translation of the above mentioned disparity map into depth data, according to well known geometric equations, particularly triangulation equations.
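By way of non-limiting illustration, the translation of a disparity map into depth data may be sketched as follows. The sketch assumes a rectified arrangement and a pinhole imaging model, for which the depth of a feature is z = f·b / d, where f is the focal length in pixels, b is the baseline and d is the disparity in pixels. The function and parameter names are hypothetical, and features with no valid disparity are simply marked as unknown.

```python
from typing import List, Optional

DisparityMap = List[List[Optional[float]]]
DepthMap = List[List[Optional[float]]]


def disparity_to_depth(disparity_px: DisparityMap,
                       focal_length_px: float,
                       baseline: float) -> DepthMap:
    """Convert a disparity map (pixels) into a depth map (baseline units).

    Assumption: rectified geometry and a pinhole model, so z = f * b / d.
    Entries that are None or non-positive are treated as unidentified
    features and yield None in the output.
    """
    depth: DepthMap = []
    for row in disparity_px:
        depth_row: List[Optional[float]] = []
        for d in row:
            if d is None or d <= 0:
                depth_row.append(None)  # no valid correspondence here
            else:
                depth_row.append(focal_length_px * baseline / d)
        depth.append(depth_row)
    return depth
```

Because depth is inversely proportional to disparity under this model, a fixed error in the disparity estimate produces a larger depth error for distant features than for near ones.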
The very fact that hundreds of methods and systems exist today hints at the underlying problem: the lack of a sufficiently effective and reliable method for 3D imaging. Furthermore, most of the systems that utilize active triangulation methods today are restricted to non-dynamic imaging of objects. That is to say, even at high shutter speeds and using high resolution cameras, the imaged object must remain static. For example, a building may be imaged, but not a person riding a bike or cars moving on the street. This limitation on three dimensional imaging is a direct result of the need in most triangulation based 3D imaging systems to obtain a series of images while changing the characteristics of the light source over time. For example, many methods utilize a number of light patterns projected over a time interval, a technique known as temporal coding.
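By way of non-limiting illustration, the dependence of temporal coding on a static scene may be sketched as follows. The sketch assumes one simple, well known form of temporal coding, namely plain binary stripe patterns in which the k-th projected pattern carries the k-th bit of each projector column index; it is not asserted that any particular prior art system uses exactly this scheme. The function names are hypothetical.

```python
from typing import List


def temporal_patterns(num_columns: int, num_bits: int) -> List[List[int]]:
    """Return num_bits stripe patterns, one per projected frame.

    Pattern k holds bit k (most significant bit first) of every projector
    column index, so a column is identified only by the full sequence.
    """
    if num_columns > 2 ** num_bits:
        raise ValueError("not enough patterns to label every column")
    return [
        [(col >> (num_bits - 1 - k)) & 1 for col in range(num_columns)]
        for k in range(num_bits)
    ]


def decode_column(observed_bits: List[int]) -> int:
    """Recover a column index from the bits one pixel saw across frames.

    Assumption: the pixel views the same surface point in every frame.
    If the object moves between frames, the bits come from different
    columns and the decoded index is meaningless.
    """
    index = 0
    for bit in observed_bits:
        index = (index << 1) | bit
    return index
```

In this sketch, eight projector columns require three successive frames, and each pixel must observe the same surface point in all three; this is the sense in which the imaged object must remain static for the duration of the pattern sequence.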
Nonetheless, many methods have been introduced over the years for the three dimensional imaging of moving objects, most of which are based on the projection of a single pattern of light on the imaged object, thus enabling reconstruction of the depth measurements from one or more simultaneous images rather than multiple images over a time interval. These single pattern methods can be broken down into two main classes. The first is assisted stereo methods wherein a comparison is made between two or more images from two or more imaging systems to compute depth data.
The second is structured light methods, and in particular coded structured light methods. These methods often use only one imaging system or camera. Coded structured light methods can further be broken down into several encoding types. One such encoding type is spatial coding, which suffers from a wide range of problems of precision and reliability, particularly regarding feature identification, as well as other serious performance limitations. As a result, spatial single pattern systems have been implemented commercially only in a very limited manner. A further structured coding technique in the art is spectral coding, which requires a highly non-textured or uniformly textured surface and requires expensive imaging systems. As a result, these spectral based systems do not adequately provide the performance necessary for high texture applications.
Therefore, there is an unmet need for, and it would be highly useful to have, a system and a method that overcome the above drawbacks.