The present invention relates to a system and a method for three dimensional imaging and depth measurement of objects using active triangulation methods, and, more particularly, but not exclusively, to three dimensional imaging of objects both at rest and in motion.
Three dimensional sensor systems are used in a wide array of applications. These sensor systems determine the shape and/or surface features of an object positioned in a scene within the sensor system's view. In recent years, many methods have been proposed for implementing 3D modeling systems that are capable of rapid acquisition of accurate, high resolution 3D images of objects for various applications.
The precise configuration of such 3D imaging systems may be varied. Many current triangulation-based systems use an array of two or more cameras to determine the depth values by what is known as passive stereo correspondence. Such a method depends on the imaged surfaces being highly textured and is therefore error prone and non-robust. Furthermore, automatic correspondence algorithms often produce an abundance of errors when matching between shots from different cameras.
Other methods utilize LIDAR (Light Detection and Ranging) systems to determine range and/or other information of a distant target. Using light pulses, the distance to an object is determined by measuring the time delay between transmission of the light pulse and detection of the reflected signal. Such methods, referred to as time-of-flight, are generally immune to the occlusions typical of triangulation methods, but their accuracy and resolution are inherently inferior to those obtained in triangulation methods.
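By way of illustration only, the time-of-flight relationship described above may be sketched as follows. The function name `tof_distance_m` and its parameter are hypothetical and are not taken from any particular LIDAR system; the sketch assumes propagation at the vacuum speed of light and a single clean reflection.

```python
# Minimal sketch of time-of-flight ranging (illustrative assumption, not a
# description of any specific LIDAR product).
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0  # vacuum speed of light


def tof_distance_m(round_trip_delay_s: float) -> float:
    """Return the one-way distance for a measured round-trip pulse delay."""
    if round_trip_delay_s < 0:
        raise ValueError("round-trip delay must be non-negative")
    # The pulse travels to the target and back, so halve the path length.
    return SPEED_OF_LIGHT_M_PER_S * round_trip_delay_s / 2.0
```

Under these assumptions, a timing uncertainty of one nanosecond corresponds to a range uncertainty of roughly 0.15 m, which suggests why fine depth resolution is difficult to obtain with this approach.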
Active triangulation-based 3D sensor systems and methods typically have one or more projectors as a light source for projecting onto a surface and one or more cameras at a defined, typically rectified, position relative to the projector for imaging the lighted surface. The camera and the projector therefore have different optical paths, and the distance between them is referred to as the baseline. Through knowledge of the baseline distance as well as the projection and imaging angles, known geometric/triangulation equations are utilized to determine the distance to the imaged object. The main differences among the various triangulation methods known in the art lie in the method of projection, in the type of light projected, typically structured light, and in the process of image decoding to obtain three dimensional data.
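A minimal sketch of one such triangulation equation is given below, by way of example only. It assumes a planar geometry in which the projector and the camera sit at the two ends of the baseline, and in which the projection angle and the imaging angle are both measured from the baseline to the respective ray. The function name `triangulate_depth` and this angle convention are assumptions made for illustration.

```python
import math


def triangulate_depth(baseline: float,
                      projection_angle_rad: float,
                      imaging_angle_rad: float) -> float:
    """Perpendicular distance from the baseline to the lit surface point.

    Assumed convention: both angles are interior angles of the triangle
    formed by the projector, the camera, and the surface point, each
    measured from the baseline.
    """
    total = projection_angle_rad + imaging_angle_rad
    if projection_angle_rad <= 0 or imaging_angle_rad <= 0 or total >= math.pi:
        raise ValueError("angles must be positive and sum to less than pi")
    # Law of sines: depth = b * sin(alpha) * sin(beta) / sin(alpha + beta).
    return (baseline * math.sin(projection_angle_rad)
            * math.sin(imaging_angle_rad) / math.sin(total))
```

As the two rays approach parallel, the denominator approaches zero and small angular errors produce large depth errors, which is one reason the baseline is a significant design parameter.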
Methods of light projection vary from temporal methods to spatial coded structured light. Examples in the art of various forms of projected light include “laser fans” and line coded light.
Once a 2D image is captured of the object upon which light is projected as described above, image processing software generally analyzes the image to extract the three dimensional geometry of the object and possibly the three dimensional movement of the object through space. This is generally done by comparing features in the captured image with previously captured images and/or with known characteristics and features of the projected light. The implementation of this step varies widely among currently known methods, typically as a function of the method used to project light onto the object. Whatever the method used, the outcome of the process is generally a type of disparity/displacement map of identified features in the captured image. The final step of 3D spatial location and/or 3D motion capture involves the translation of the above-mentioned disparity map into depth data, according to well known geometric equations, particularly triangulation equations.
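The translation of a disparity map into depth data may be sketched, for illustration only, as follows. The sketch assumes a rectified arrangement and a pinhole imaging model, for which depth equals the focal length (in pixels) multiplied by the baseline and divided by the disparity (in pixels). The function name `disparity_to_depth`, its parameters, and the use of `None` for pixels without a usable disparity are all assumptions, not features of any particular known method.

```python
def disparity_to_depth(disparity_map, focal_length_px, baseline_m):
    """Convert a 2D list of disparities (pixels) to depths (metres).

    Assumes rectified geometry, so that depth = f * B / d. Pixels whose
    disparity is missing or non-positive are returned as None.
    """
    depth_map = []
    for row in disparity_map:
        depth_row = []
        for d in row:
            if d is None or d <= 0:
                depth_row.append(None)  # no identified feature at this pixel
            else:
                depth_row.append(focal_length_px * baseline_m / d)
        depth_map.append(depth_row)
    return depth_map
```

Because depth varies inversely with disparity under this model, a fixed disparity error of one pixel translates into a larger depth error for distant objects than for near ones.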
The very fact that hundreds of methods and systems exist today hints at the underlying problem: the lack of a sufficiently effective and reliable method for 3D imaging. Furthermore, most of the systems that utilize active triangulation methods today are restricted to non-dynamic imaging of objects. That is to say, even at high frame rates and shutter speeds, the imaged object must remain static during image acquisition. For example, a building may be imaged, but not a person riding a bike or cars moving on the street. This limitation on three dimensional imaging is a direct result of the need in most triangulation-based 3D imaging systems to obtain a series of images while changing the characteristics of the light source over time. For example, many methods utilize a number of light patterns projected over a time interval, a technique known as temporal coding.
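To illustrate why temporal coding requires a static scene, the following sketch uses a binary-reflected Gray code stripe sequence. This is one possible temporal coding scheme chosen here purely as an example; the text above does not name a specific scheme, and the function names `gray_code_patterns` and `decode_column` are hypothetical.

```python
def gray_code_patterns(num_columns: int) -> list:
    """Return one stripe pattern per frame; each pattern has one bit per column."""
    num_bits = max(1, (num_columns - 1).bit_length())
    patterns = []
    for bit in range(num_bits - 1, -1, -1):  # most significant bit first
        # col ^ (col >> 1) is the Gray code of the projector column index.
        patterns.append([((col ^ (col >> 1)) >> bit) & 1
                         for col in range(num_columns)])
    return patterns


def decode_column(bits_over_time: list) -> int:
    """Recover the projector column from the bits one pixel saw across frames."""
    gray = 0
    for b in bits_over_time:
        gray = (gray << 1) | b
    column = 0
    while gray:  # Gray-to-binary conversion by cumulative XOR of shifts
        column ^= gray
        gray >>= 1
    return column
```

In this sketch, a pixel can be decoded only after every frame in the sequence has been observed at that same pixel. If the object moves between frames, the bits gathered at a pixel belong to different surface points and the decoded column is no longer meaningful.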
Nonetheless, many methods have been introduced over the years for the three dimensional imaging of moving objects, most of which are based on the projection of a single pattern of light on the imaged object, thus enabling reconstruction of the depth measurements from one or more simultaneous images rather than multiple images over a time interval. These single pattern methods can be broken down into two main classes. The first is assisted stereo methods wherein a single pattern is projected and a comparison is made between two or more images from two or more imaging systems to compute depth data.
The second is structured light methods, and in particular coded structured light methods. These methods often use only one imaging system or camera. Coded structured light methods can further be broken down into several encoding types. One such type is spatial coding, which suffers from a wide range of problems of precision and reliability, particularly regarding feature identification, as well as other serious performance limitations. As a result, spatial single pattern systems have been implemented commercially only in a very limited manner. A further structured coding technique in the art is spectral or color coding, which requires a color-neutral surface and usually requires expensive imaging systems.
Therefore, there is an unmet need for, and it would be highly useful to have, a system and a method that overcome the above drawbacks.