Three-dimensional scanning and digitization of the surface geometry of objects is commonly used in many industries and services, and its applications are numerous. A few examples of such applications are inspection and measurement of shape conformity in industrial production systems, digitization of clay models for industrial design and styling applications, reverse engineering of existing parts with complex geometry, interactive visualization of objects in multimedia applications, three-dimensional documentation of artwork and artifacts, and human body scanning for better orthotics adaptation, biometry or custom-fit clothing.
The shape of an object is scanned and digitized using a ranging sensor that measures the distance between the sensor and a set of points on the surface. Different principles have been developed for range sensors. Among them, interferometry, time-of-flight and triangulation-based principles are well known, and each is more or less appropriate depending on the required accuracy, the stand-off distance between the sensor and the object, and the required depth of field.
Triangulation-based range sensors are generally adequate for close-range measurements, typically at distances of less than a few meters. Using this type of apparatus, at least two rays that converge to the same feature point on the object are obtained from two different viewpoints separated by a baseline distance. From the baseline and the two ray directions, the relative position of the observed point can be recovered. The intersection of both rays is determined using the knowledge of one side length and two angles in the triangle, which is the principle of triangulation in stereovision. The challenge in stereovision is to efficiently identify which pixels correspond to each other in the two images.
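The triangle solution described above can be illustrated with a minimal sketch. It assumes a planar configuration in which the two viewpoints lie on the x axis and the two angles are measured between the baseline and each ray; the function name and the numeric values are illustrative only and are not taken from any particular sensor.

```python
import math


def triangulate(baseline, alpha, beta):
    """Locate a point from one side length and two angles of a triangle.

    Viewpoint A sits at (0, 0) and viewpoint B at (baseline, 0).
    alpha and beta are the interior angles (radians) at A and B between
    the baseline and the ray toward the observed point.
    Returns (x, z): position along the baseline and depth away from it.
    """
    gamma = math.pi - alpha - beta  # angle at the observed point
    if alpha <= 0 or beta <= 0 or gamma <= 0:
        raise ValueError("rays do not converge in front of the baseline")
    # Law of sines: |AP| / sin(beta) = baseline / sin(gamma)
    range_from_a = baseline * math.sin(beta) / math.sin(gamma)
    return range_from_a * math.cos(alpha), range_from_a * math.sin(alpha)


# Hypothetical example: 0.2 m baseline, rays at 80 and 75 degrees.
x, z = triangulate(baseline=0.2, alpha=math.radians(80), beta=math.radians(75))
print(f"x = {x:.4f} m, depth = {z:.4f} m")
```

As the angle at the observed point becomes small, its sine approaches zero and small angular errors produce large depth errors, which is consistent with triangulation being best suited to close-range measurement.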
To simplify the problem, one can replace one of the light detectors (cameras) with a light projector that outputs a set of rays in known directions. In this case, it is possible to exploit the direction of the projected rays and each detected ray reflected on the object surface to solve the triangle. It is then possible to calculate the coordinates of each observed feature point relative to the basis of the triangle.
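In three dimensions, a projected ray and a detected ray rarely intersect exactly because of noise, so a common way to solve the triangle is to take the midpoint of the shortest segment joining the two rays. The following sketch illustrates that midpoint method under the assumption that both rays are expressed in a common coordinate frame; the function names are hypothetical.

```python
def dot(a, b):
    """Dot product of two 3-vectors given as sequences."""
    return sum(p * q for p, q in zip(a, b))


def closest_point_between_rays(o1, d1, o2, d2):
    """Midpoint of the shortest segment joining two 3D rays.

    o1, d1: origin and direction of the projector ray.
    o2, d2: origin and direction of the camera ray.
    Directions need not be normalized.
    """
    w = [p - q for p, q in zip(o1, o2)]
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, w), dot(d2, w)
    denom = a * c - b * b
    if abs(denom) < 1e-12:
        raise ValueError("rays are parallel; no unique closest point")
    s = (b * e - c * d) / denom  # parameter along the projector ray
    t = (a * e - b * d) / denom  # parameter along the camera ray
    p1 = [o + s * k for o, k in zip(o1, d1)]
    p2 = [o + t * k for o, k in zip(o2, d2)]
    return [(u + v) / 2 for u, v in zip(p1, p2)]


# Hypothetical setup: projector at the origin, camera 1 unit along x.
point = closest_point_between_rays(
    (0.0, 0.0, 0.0), (0.5, 0.0, 1.0),
    (1.0, 0.0, 0.0), (-0.5, 0.0, 1.0),
)
print(point)
```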
Although specialized light detectors can be used, digital CCD or CMOS cameras are typically used.
For the projector, the light source can be a coherent source (laser) or a non-coherent source (e.g., white light) projecting a spot, a light plane or many other possible patterns of projection, including a full-field pattern. A full-field pattern is a 2D pattern which can cover a portion or the whole of the projector's 2D field of illumination. In this case, a dense set of corresponding points can be matched in each image. Use of a light projector facilitates the detection of reflected points everywhere on the object surface so as to provide a dense set of measured surface points. However, the more complex the pattern, the greater the challenge of efficiently identifying corresponding pixels and rays.
For this reason, properties from the theory of projective geometry are further exploited. It has been well known in the field for at least 30 years that, in the case of two views, one may exploit epipolar constraints to limit the search for corresponding pixels to a single straight line, rather than searching the entire image. This principle is widely exploited in both passive and active (with a projector) stereovision. One example of this usage is a system in which two cameras and a laser projector projecting a crosshair pattern are used. The arrangement of the two cameras and the laser is such that each of the laser planes composing the crosshair is aligned within an epipolar plane of each of the cameras. Thus, one of the laser planes will always be imaged in the same position in one image, independently of the observed geometry. It is then possible to disambiguate between the two laser planes in the image. This is a non-traditional application of epipolar geometry in structured light systems.
The epipolar geometry can be computed from calibration parameters or after matching a set of points in two images. Thus, given a point in one image, it is possible to calculate the parameters of the equation of the straight line (the epipolar line) in the second image on which the corresponding point will lie. Another approach consists of rectifying the two images, which means all epipolar lines will be horizontal and aligned. Rectifying images is thus advantageous since no further calculation needs to be performed for identifying pixels on the epipolar lines. Image rectification can be applied by software or even by carefully aligning the relative orientation of one or both cameras (or the projector). In the latter case, the approach is referred to as hardware alignment.
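As an illustration of the epipolar line calculation, the sketch below assumes the epipolar geometry is available as a 3x3 fundamental matrix F under the convention x2^T F x1 = 0, so that the line in the second image is F x1. The matrix used here is the idealized one for a perfectly rectified pair, and the function names and pixel values are illustrative assumptions.

```python
import math


def epipolar_line(F, point):
    """Return (a, b, c) of the line a*u + b*v + c = 0 in the second image
    that corresponds to `point` = (u, v) in the first image."""
    u, v = point
    x = (u, v, 1.0)  # homogeneous pixel coordinates
    return tuple(sum(F[r][c] * x[c] for c in range(3)) for r in range(3))


def distance_to_line(line, point):
    """Perpendicular distance, in pixels, from a point to a line."""
    a, b, c = line
    u, v = point
    return abs(a * u + b * v + c) / math.hypot(a, b)


# Fundamental matrix of an ideal rectified pair: every epipolar line is
# the horizontal row with the same v coordinate as the query point.
F_RECTIFIED = [
    [0.0, 0.0, 0.0],
    [0.0, 0.0, -1.0],
    [0.0, 1.0, 0.0],
]

line = epipolar_line(F_RECTIFIED, (120.0, 45.0))
# A candidate on row 45 satisfies the constraint; one on row 47 does not.
print(distance_to_line(line, (300.0, 45.0)))
print(distance_to_line(line, (300.0, 47.0)))
```

With rectified images the line equation reduces to matching row indices, which is why no further calculation is needed to identify pixels on the epipolar lines.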
Several examples of hardware-aligned cameras and projectors exist where the projector projects vertical stripes and the camera is aligned in such a way that the epipolar lines are horizontal. This type of alignment has been used in several other structured light systems exploiting Gray code vertical patterns. Projecting vertical stripes is less demanding on the alignment of the projector and cameras, but reduces the spatial density of points from a single projected frame. A full-field code can also be projected. The projector and camera are again aligned in such a way that the coded pattern along each line is projected along the epipolar lines in the projector slide. Under these circumstances, the scene geometry has almost no effect on the direction and vertical separation of the row-coded pattern. These coded patterns will remain along a single line independently of the distance to the object. However, the relevant information for capturing 3D measurements will be retrieved from the deformation of the code along the epipolar lines. This alignment with the epipolar lines makes it possible to project a different code along each line.
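The Gray code vertical patterns mentioned above can be sketched as follows. This is a generic illustration of binary-reflected Gray coding, in which each projected frame carries one bit of the stripe index; it is not the encoding of any specific system, and the function names and the 8-column, 3-frame example are assumptions.

```python
def to_gray(n):
    """Binary-reflected Gray code of a non-negative integer."""
    return n ^ (n >> 1)


def from_gray(g):
    """Inverse of to_gray."""
    n = g
    shift = g >> 1
    while shift:
        n ^= shift
        shift >>= 1
    return n


def stripe_patterns(num_columns, num_bits):
    """One row of 0/1 values per projected frame, most significant bit
    first. Each value says whether that projector column is lit."""
    return [
        [(to_gray(col) >> bit) & 1 for col in range(num_columns)]
        for bit in reversed(range(num_bits))
    ]


def decode_column(bits):
    """Recover the projector column from the bits observed at one camera
    pixel across the frame sequence (most significant bit first)."""
    g = 0
    for b in bits:
        g = (g << 1) | b
    return from_gray(g)


patterns = stripe_patterns(num_columns=8, num_bits=3)
observed = [frame[5] for frame in patterns]  # bits seen for column 5
print(observed, decode_column(observed))
```

Adjacent columns differ in exactly one bit, so a pixel straddling a stripe boundary is decoded to one of the two neighboring columns rather than to an arbitrary one.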
Unfortunately, there is an unresolved issue with the application of the principle of epipolar geometry. Its reliability varies depending on the type and quality of the projector lens, because it does not account for lens distortion. In the presence of lens distortion in either the projector or the camera, epipolar lines will not be straight lines. They will be curved and cannot be assumed to strictly result from the intersection of the epipolar plane with the image plane. Distortion is generally more pronounced for short-range systems requiring lenses with short focal lengths. Although it can be corrected after calibration through software calculation for the camera, it cannot be corrected afterwards for the projector. In this case, a code initially aligned along a straight line (epipolar) in the projector image (hereafter referred to as slide image) will not be physically projected along a straight line after the lens and will thus not result in a good alignment along the epipolar line in the image of the camera. For most lenses, distortion increases towards the sides and corners of the images. One will either lose these points, compensate with larger bands for encoding the signal along the distorted epipolar lines (thus reducing the resolution of measurement), or apply more complex calculations that defeat the initial goal of simplifying matching.
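The bending of a nominally straight line by lens distortion can be illustrated numerically. The sketch below assumes a single-coefficient radial distortion model in normalized image coordinates, which is a common simplification and not necessarily the behavior of any given projector lens; the coefficient value and function names are hypothetical.

```python
def distort(x, y, k1):
    """Apply one-coefficient radial distortion about the image center."""
    r2 = x * x + y * y
    scale = 1.0 + k1 * r2
    return x * scale, y * scale


def line_bending(y0, k1, samples=11):
    """Vertical spread of a horizontal line y = y0 after distortion,
    sampled for x between -1 and 1. A straight result gives 0."""
    ys = []
    for i in range(samples):
        x = -1.0 + 2.0 * i / (samples - 1)
        ys.append(distort(x, y0, k1)[1])
    return max(ys) - min(ys)


# A row through the image center stays straight, while a row halfway
# toward the edge is bent; without distortion nothing bends.
for y0 in (0.0, 0.25, 0.5):
    print(y0, line_bending(y0, k1=-0.2), line_bending(y0, k1=0.0))
```

In this model the displacement grows with the squared distance from the center, which matches the observation that distortion increases towards the sides and corners of the images.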