Traditional three-dimensional (3D) scene reconstruction algorithms, such as stereo, space-time stereo, multi-view stereo or structured light, measure distance to the objects of a scene by analyzing the triangles constructed by the projection rays of two optical systems. Traditional stereo systems use two cameras to find two projection rays from the two camera centers which meet at the object surface being measured. Space-time stereo systems use two cameras and a projector or a time-varying light source for 3D scene reconstruction. Multi-view stereo systems use multiple cameras for 3D scene reconstruction. Stereo systems, including space-time stereo and multi-view stereo, require solving a correspondence problem, which determines which pairs of points in two images captured by the camera(s) are projections of the same point on an object being processed. The process of solving the correspondence problem is generally referred to as epipolar search, which is complex and computationally expensive.
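The triangulation step described above can be sketched as follows. This is an illustrative example only, not taken from the document: given each camera's center and the direction of its projection ray (after correspondence has been resolved), the 3D point is recovered as the midpoint of the shortest segment between the two rays. All function and variable names here are assumptions for illustration.

```python
# Sketch of stereo triangulation: given two projection rays, one per
# camera, recover the 3D point where they (nearly) meet.

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def triangulate(c1, d1, c2, d2):
    """Midpoint of the shortest segment between ray c1 + s*d1 and ray c2 + t*d2."""
    w0 = [a - b for a, b in zip(c1, c2)]
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, w0), dot(d2, w0)
    denom = a * c - b * b          # zero when the rays are parallel
    s = (b * e - c * d) / denom
    t = (a * e - b * d) / denom
    p1 = [ci + s * di for ci, di in zip(c1, d1)]   # closest point on ray 1
    p2 = [ci + t * di for ci, di in zip(c2, d2)]   # closest point on ray 2
    return [(x + y) / 2 for x, y in zip(p1, p2)]

# Two rays, from camera centers (0,0,0) and (1,0,0), that meet at (1,1,5):
point = triangulate([0, 0, 0], [1, 1, 5], [1, 0, 0], [0, 1, 5])
print(point)  # [1.0, 1.0, 5.0]
```

Because noisy measurements mean real rays rarely intersect exactly, the midpoint form is used rather than a direct intersection.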
The camera calibration process estimates the internal and/or external parameters of a camera that affect 3D scene reconstruction. The internal parameters of a camera, such as camera focal length, optical center location, skewness of image pixels, and radial distortion of the lens, are often called intrinsic camera parameters. The external parameters of a camera define the location and orientation of the camera with respect to some world coordinate system. The external parameters of a camera are often called extrinsic camera parameters. Two optical systems calibrated with respect to a known reference coordinate system, such as two calibrated cameras, or one calibrated camera and one calibrated projector, are related to each other by a translation matrix and a rotation matrix which map a pixel point in one optical system to a corresponding pixel point in the other optical system. Camera-camera calibration calculates the translation and rotation matrices between the calibrated cameras. Similarly, camera-projector calibration calculates the translation and rotation matrices between the calibrated projector and camera.
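As a minimal sketch of how the extrinsic relationship described above is applied (names and values here are illustrative assumptions, not from the document): once camera-camera calibration has produced a rotation matrix R and translation vector t, a 3D point expressed in the first camera's coordinate frame maps into the second camera's frame as x2 = R·x1 + t.

```python
# Illustrative use of the rotation matrix R and translation vector t
# recovered by camera-camera calibration: map a 3D point from camera 1's
# coordinate frame into camera 2's frame.

def transform(R, t, p):
    """x2 = R @ x1 + t, written out for plain 3-element lists."""
    return [sum(R[i][j] * p[j] for j in range(3)) + t[i] for i in range(3)]

# Example extrinsics (assumed values): a 90-degree rotation about the
# Z axis plus a one-unit shift along X.
R = [[0, -1, 0],
     [1,  0, 0],
     [0,  0, 1]]
t = [1, 0, 0]
print(transform(R, t, [1, 0, 0]))  # [1, 1, 0]
```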
FIG. 1A is a block diagram illustrating a traditional space-time stereo 3D scene reconstruction system. The reconstruction system illustrated in FIG. 1A comprises a pair of cameras, 102a and 102b, a projector 104 and a camera-camera calibration module 106. The projector 104 projects light patterns to disambiguate correspondence between a pair of projection rays, resulting in a unique match between pixels captured in both cameras 102. A reference coordinate system for reconstruction is defined in one of the cameras 102. The projector 104 need not be calibrated with the cameras 102 nor defined in the stereo rig of the two cameras 102. For each camera 102, the camera-camera calibration module 106 calibrates the camera 102 and generates extrinsic camera parameters with respect to the reference coordinate system. The camera-camera calibration module 106 also calculates the rotation and translation matrices between the cameras 102 using the intrinsic and extrinsic camera parameters of the cameras 102. The camera-camera calibration module 106 may use any existing camera calibration algorithm which is readily known to a person of ordinary skill in the art, such as photogrammetric calibration or self-calibration. To reconstruct the 3D scene, the rotation and translation matrices between the cameras 102 obtained by camera calibration are needed. Additionally, the computationally expensive epipolar search for correspondence described above is needed for the reconstruction.
To reconstruct a 360-degree view of a scene, cameras in a traditional stereo or structured light reconstruction system need to go around the scene and generate depthmaps or point clouds from many views. The reconstruction system then links the reconstructed pieces together. FIG. 1C is a block diagram illustrating a traditional multi-view stereo 3D scene reconstruction system. The reconstruction system in FIG. 1C includes three calibrated cameras 102a-102c and two camera-camera calibration modules 106a and 106b. More calibrated cameras 102 and camera-camera calibration modules 106 may be used for multi-view reconstruction. The camera-camera calibration module 106a calibrates the cameras 102a and 102b and generates the rotation and translation matrices between the calibrated cameras 102a and 102b. Similarly, the camera-camera calibration module 106b calibrates the cameras 102b and 102c. Like the traditional space-time stereo reconstruction system illustrated in FIG. 1A, the multi-view stereo reconstruction system in FIG. 1C requires a computationally expensive epipolar search for correspondence for 3D scene reconstruction.
Traditional structured light systems for 3D scene reconstruction employ one camera and a projector. Light travels from the projector, is reflected off the surface of an object in the scene, and reaches the camera. Instead of finding two corresponding incoming projection rays, a structured light algorithm projects specific light pattern(s) onto the surface of an object, and from the observed patterns, the algorithm determines which projection ray is reflected on the surface of the object in the scene and reaches the camera. Traditional structured light systems require the projected patterns to be well differentiated from the other objects and ambient light falling on the surface of the object being processed. This requirement often translates into a requirement for a high-powered and well-focused projected light.
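One common coding scheme for identifying the projection ray (an assumption for illustration; the document does not name a specific scheme) is to project a sequence of Gray-code stripe patterns. The on/off values a camera pixel observes across the sequence form a Gray-code word that identifies the projector column which illuminated it:

```python
# Decoding a Gray-code structured-light sequence: the bits a camera pixel
# observed across the projected stripe patterns (MSB first) identify the
# projector column that lit it.

def gray_to_binary(bits):
    """Convert a Gray-code bit list (MSB first) to the column index."""
    value = 0
    for bit in bits:
        # Each binary bit is the Gray bit XOR the previous binary bit.
        value = (value << 1) | (bit ^ (value & 1))
    return value

# A pixel that observed on/on/on across three patterns saw Gray code 111,
# i.e., projector column 5:
print(gray_to_binary([1, 1, 1]))  # 5
```

Gray codes are preferred over plain binary stripes because adjacent columns differ in only one bit, which limits the effect of a single misread pattern.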
FIG. 3A is a block diagram illustrating a structured light reconstruction method running in a traditional structured light 3D scene reconstruction system. The reconstruction system in FIG. 3A includes a camera 102, a reference coordinate system 308 defined in the camera 102, a projector 104, a camera-projector calibration module 116a and an object being processed 302. The camera-projector calibration module 116a calibrates the camera 102 and the projector 104. To calibrate the projector 104 with respect to the reference coordinate system 308, the camera-projector calibration module 116a may use any existing projector calibration method that is known to a person of ordinary skill in the art, such as P. Song, "A theory for photometric self-calibration of multiple overlapping projectors and cameras," IEEE International Workshop on Projector-Camera Systems, 2005, which is incorporated by reference herein in its entirety.
To reconstruct the 3D image of the object 302, the projector 104 projects a plane 306a of a calibration pattern onto the object 302, and the camera 102 observes a reflected ray 306b from the object 302. The reconstruction system in FIG. 3A identifies which projected ray is reflected on the surface of the object 302 and reaches the camera 102. For example, the system identifies that an observed point 304 on the surface of the object 302 is the intersection of the projection plane 306a projected from the projector 104 and the viewing ray 306b from the camera 102. The reconstruction system illustrated in FIG. 3A requires the projected patterns to be well differentiated from the other objects in the scene and ambient light falling on the surface of the object 302 being processed. This requirement often translates into a requirement for a high-powered and well-focused projected light. In addition, the reconstruction system in FIG. 3A needs to know the camera 102 position relative to the projector 104, i.e., the rotation and translation matrices between the camera 102 and the projector 104, for the reconstruction.
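The plane-ray intersection described above can be sketched as follows (an illustrative example under assumed names: the projected light plane is represented by a point on it and its normal, and the camera viewing ray by the camera center and a direction):

```python
# Sketch of structured-light triangulation: intersect the camera's viewing
# ray c + t*d with the projected light plane through point p0 with normal n.

def intersect_ray_plane(c, d, p0, n):
    """Point where the viewing ray meets the projected light plane."""
    dot = lambda u, v: sum(a * b for a, b in zip(u, v))
    # Solve n . (c + t*d - p0) = 0 for t; assumes the ray is not
    # parallel to the plane (n . d != 0).
    t = dot(n, [a - b for a, b in zip(p0, c)]) / dot(n, d)
    return [ci + t * di for ci, di in zip(c, d)]

# A camera at the origin views along (1, 1, 5); the projected plane is z = 5:
point = intersect_ray_plane([0, 0, 0], [1, 1, 5], [0, 0, 5], [0, 0, 1])
print(point)  # [1.0, 1.0, 5.0]
```

Note that this computation only works once the camera and projector poses are expressed in a common frame, which is why the rotation and translation matrices between them are required.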
In a conventional camera-projector system using traditional 3D scene reconstruction algorithms, camera(s) need to be attached rigidly to a projector rig, and the rotation and translation matrices between a camera and a projector are required for the reconstruction. Consequently, the reconstruction process is often very computationally expensive, or requires camera motion tracking or mesh stitching to integrate multiple snapshots into a full 3D model.