1. Field of Invention
This invention relates to three-dimensional (3D) surface mapping systems, specifically to systems that use a plurality of image sensors and optical flow to calculate Z-distances.
2. Prior Art
Reconstructing the 3D coordinates of points on surfaces in a scene from one or more two-dimensional (2D) images is one of the main topics of computer vision. The uses of such systems include navigation, mapping, gaming, motion analysis, medical imaging, and 3D photography.
In stereoscopic image processing, a pair of 2D images of the scene is taken by right and left cameras (stereo camera pair) from different positions, and correspondences (2D point pairs—one from each image that represent the same location in the 3D scene) between the images are found. Using the correspondences, the Z-distance (the distance between the optical center of one of the stereo cameras and the target) is found from the parallax according to the principle of triangulation using epipolar geometry.
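The triangulation step described above can be illustrated with a minimal sketch. It assumes the simplest case, a rectified stereo pair with parallel optical axes and equal focal lengths, in which the Z-distance reduces to Z = f * B / d. The function names and the numeric values are hypothetical and are not taken from any of the cited systems.

```python
def disparity_px(x_left_px: float, x_right_px: float) -> float:
    """Parallax of one correspondence: horizontal offset between the two views."""
    return x_left_px - x_right_px


def z_from_disparity(focal_length_px: float, baseline_m: float, disparity: float) -> float:
    """Z-distance for a rectified, parallel-axis stereo pair: Z = f * B / d."""
    if disparity <= 0:
        raise ValueError("disparity must be positive for a point in front of the cameras")
    return focal_length_px * baseline_m / disparity


# Hypothetical numbers: 800 px focal length, 0.1 m baseline.
d = disparity_px(420.0, 400.0)          # correspondence found at x=420 (left), x=400 (right)
z = z_from_disparity(800.0, 0.1, d)     # about 4 m

# A one-pixel error in the correspondence shifts the computed Z-distance noticeably.
z_off_by_one = z_from_disparity(800.0, 0.1, d - 1.0)
```

Because Z varies inversely with disparity, a small error in a correspondence produces a proportionally larger error in Z-distance as the target moves farther away.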
Correspondences can be manually selected or automatically selected using one of several algorithms like corner detectors, normalized cross correlation, or dynamic programming. Finding accurate correspondences automatically is a difficult problem and has yet to be completely solved. This is due to a multitude of problems which include 1) occlusions—where one of the stereo cameras can see a point that is hidden from the other camera, 2) order swapping—in certain geometries, points in the 3D scene do not follow the same progression when projected onto a 2D image, 3) repetitive patterns in an image that allow multiple solutions to the correspondence finding problem, only one of which is correct, 4) shadows which change with viewing angle and lighting conditions, 5) reflections which change with viewing angle and lighting conditions, 6) focus which can change with viewing angle, and 7) coloration which can change with changing viewing angles and lighting conditions. The result of not being able to accurately determine correspondences is that the Z-distances cannot be determined with accuracy.
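As an illustration of one of the automatic methods named above, the following is a minimal sketch of normalized cross correlation used to search for a correspondence along a single scanline. It assumes rectified images, so that a match lies on the same row, and it uses plain lists of intensities. The names ncc and best_match, the window size, and the sample rows are hypothetical.

```python
import math


def ncc(a, b):
    """Normalized cross correlation of two equal-length intensity windows, in [-1, 1]."""
    if len(a) != len(b) or not a:
        raise ValueError("windows must be non-empty and of equal length")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    da = [x - mean_a for x in a]
    db = [x - mean_b for x in b]
    denom = math.sqrt(sum(x * x for x in da) * sum(x * x for x in db))
    if denom == 0.0:
        return 0.0  # a flat window has no texture to correlate
    return sum(x * y for x, y in zip(da, db)) / denom


def best_match(left_row, right_row, x_left, half_width, max_disparity):
    """Return (disparity, score) of the right-row window that best matches the left window."""
    template = left_row[x_left - half_width : x_left + half_width + 1]
    best_d, best_score = None, -2.0
    for d in range(max_disparity + 1):
        x_right = x_left - d
        if x_right - half_width < 0:
            break  # candidate window would fall off the image
        candidate = right_row[x_right - half_width : x_right + half_width + 1]
        score = ncc(template, candidate)
        if score > best_score:
            best_d, best_score = d, score
    return best_d, best_score


# Hypothetical scanlines: the same bright feature appears 3 pixels further left in the right image.
left_row = [10, 10, 10, 10, 10, 50, 90, 50, 10, 10, 10, 10]
right_row = [10, 10, 50, 90, 50, 10, 10, 10, 10, 10, 10, 10]
found_disparity, found_score = best_match(left_row, right_row, 6, 1, 5)
```

If the bright feature repeated along the row, several windows would score equally well, which is the repetitive-pattern ambiguity listed above.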
Optical flow is a technique originally developed by Horn and Schunck (Horn, B. K., and Schunck, B. G. (1980). Determining Optical Flow. Massachusetts Institute of Technology) that detects the “apparent velocities of movement of brightness patterns in an image.” The movement of brightness patterns can be used to infer motion in the 3D scene. However, absolute distances in the 3D scene cannot be determined without knowledge of the Z-distances, and optical flow does not determine Z-distance.
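The reason optical flow alone does not fix absolute distance can be shown with a short sketch. It assumes an ideal pinhole camera and a point moving parallel to the image plane, for which the image-plane speed is proportional to the ratio of lateral speed to Z-distance. The function name and the numbers are hypothetical.

```python
def image_speed_px_per_s(focal_length_px: float,
                         lateral_speed_m_per_s: float,
                         z_distance_m: float) -> float:
    """Image-plane speed of a point moving parallel to the image plane (pinhole model)."""
    if z_distance_m <= 0:
        raise ValueError("z_distance_m must be positive")
    return focal_length_px * lateral_speed_m_per_s / z_distance_m


# A near, slow point and a far, fast point produce the same apparent velocity.
near_slow = image_speed_px_per_s(800.0, 0.5, 2.0)   # 200.0 px/s
far_fast = image_speed_px_per_s(800.0, 2.0, 8.0)    # 200.0 px/s
```

The measured flow constrains only the ratio of speed to Z-distance, so either the speed or the Z-distance must come from another source before absolute distances can be recovered.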
Using optical flow as an added constraint to find correspondences between stereo images was presented by Slesareva, Bruhn, and Weickert (Slesareva, N., Bruhn, A., and Weickert, J. (2005). Optic Flow Goes Stereo: A Variational Method for Estimating Discontinuity-Preserving Dense Disparity Maps. DAGM 2005, LNCS 3663, pp. 33-40.). Slesareva et al. proposed a method of estimating depth by integrating the epipolar constraint into the optic flow method. This extra constraint reportedly improves correspondence finding, but it does not completely resolve the difficulty of finding correspondences between two images acquired from different viewing angles, because of the issues mentioned above.
Kim and Brambley (Kim, J., Brambley, G. (2008). Dual Optic-flow Integrated Navigation for Small-scale Flying Robots. ACRA 2008.) used a stereo pair of optical flow sensors to determine depth. However, finding correspondences between optical flow fields acquired at different viewing angles is as problematic as finding them between the images themselves, for the same reasons described above. Additionally, Kim and Brambley's approach could not distinguish a change in the distance between the camera and the surface from skew between the image plane and the plane of the surface.
3D cameras using separate Z-distance range-finding systems are known in the art, for example: U.S. Pat. No. 6,323,942 entitled CMOS-Compatible Three-Dimensional Image Sensor IC, U.S. Pat. No. 6,515,740 entitled Methods for CMOS-Compatible Three-Dimensional Image Sensing Using Quantum Efficiency Modulation and U.S. Pat. No. 6,580,496 entitled Systems for CMOS-Compatible Three-Dimensional Image Sensing Using Quantum Efficiency Modulation. These patents disclose sensor systems that provide Z-distance data at each pixel location in the image sensor array for each frame of acquired data. Z-distance detectors according to the '942 patent determine Z-distance by measuring time-of-flight (TOF) between emission of pulsed optical energy and detection of target surface reflected optical energy. Z-distance systems according to the '740 and '496 patents operate somewhat similarly but detect phase shift between emitted and reflected-detected optical energy to determine Z-distance. Detection of reflected optical energy at multiple locations in the pixel array results in measurement signals that are referred to as dense depth maps. These systems have limited depth resolution because the round-trip travel times of light over such distances are extremely short and difficult to time precisely, and they are subject to noise caused by the optical energy reflecting off nearby surfaces.
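The two range-finding principles described above, and the timing difficulty they share, can be made concrete with a short sketch. It uses the standard pulsed relation Z = c * t / 2 and the standard continuous-wave relation Z = c * phase / (4 * pi * f); these are textbook forms and are not asserted to be the exact computations performed in the '942, '740, or '496 patents. The function names and the 20 MHz modulation frequency are hypothetical.

```python
import math

SPEED_OF_LIGHT_M_PER_S = 299_792_458.0  # in vacuum


def z_from_round_trip_time(round_trip_s: float) -> float:
    """Pulsed TOF: light travels to the target and back, so Z = c * t / 2."""
    return SPEED_OF_LIGHT_M_PER_S * round_trip_s / 2.0


def z_from_phase_shift(phase_rad: float, modulation_hz: float) -> float:
    """Continuous-wave TOF: Z = c * phase / (4 * pi * f), valid for phase in [0, 2*pi)."""
    return SPEED_OF_LIGHT_M_PER_S * phase_rad / (4.0 * math.pi * modulation_hz)


def timing_resolution_for_depth(depth_resolution_m: float) -> float:
    """Round-trip timing resolution needed to resolve a given depth step."""
    return 2.0 * depth_resolution_m / SPEED_OF_LIGHT_M_PER_S


# Resolving 1 mm of depth requires timing the round trip to within a few picoseconds.
dt_for_1_mm = timing_resolution_for_depth(1e-3)

# Hypothetical 20 MHz modulation: a half-cycle phase shift corresponds to a few metres.
z_half_cycle = z_from_phase_shift(math.pi, 20e6)
```

The picosecond-scale value of dt_for_1_mm illustrates why the depth resolution of such systems is limited by timing precision.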
U.S. Pat. No. 8,134,637 discloses a depth camera that incorporates a beam splitter, which separates the incoming light into visible light for image creation and near-infrared (NIR) light originating from an NIR light emitter. FIG. 1 shows the system of the '637 patent. This system emits NIR light from an emitter 105 modulated by modulator 125. The NIR light output 25 is focused on the target surface 40 via lens 115. The reflected NIR optical energy 30 enters lens 20′ coaxially with the higher resolution Red-Green-Blue (RGB) light energy. The beam splitting structure 140 and hot mirror 150 separate the NIR light energy from the RGB light energy. The RGB light energy goes to one image sensor array 160 on first array substrate 170, and the NIR light goes to a second, lower resolution image sensor 130 on second substrate 170′. RGB data is processed by the RGB processor unit 65. NIR data is processed by Z-distance processor 135. While this arrangement reportedly improves the X and Y resolution of the Z-distance data, the resolution of the Z-distance data itself still suffers from the difficulty of accurately detecting extremely short TOF durations (on the order of picoseconds) and from noise caused by the NIR light energy reflecting off nearby surfaces.