1. Field of the Invention
The disclosed technology relates to the field of computer vision, and more particularly to an imaging system and method that generate a depth map.
2. Description of the Related Technology
Stereo-matching estimates disparity distances between corresponding pixels in a pair of stereo images or videos captured from parallel cameras in order to extract depth information of objects in a scene. Stereo-matching has many applications such as 3D gesture recognition, viewpoint synthesis, and stereoscopic TV.
In general, imaging methods that generate a depth map can be classified into two categories: global and local methods.
Global methods usually formulate the stereo-matching problem as an energy minimization, with the objective of finding a disparity function d that minimizes a global energy. The energy function may be expressed by the following equation:
E(d) = Edata(d) + λEsmooth(d)  (1)
where Edata(d) measures how well the disparity function d agrees with the stereo image pair, Esmooth(d) encodes the smoothness assumptions made by the method and measures differences between neighboring pixels' disparities, and λ weights the smoothness term relative to the data term. Once the energy function is formulated, it may be minimized using, for example, dynamic programming, graph cuts, or belief propagation.
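For illustration only, the following Python sketch evaluates equation (1) for one candidate disparity assignment on a single scanline. It is not taken from the disclosed technology. The function name energy, the use of absolute differences for both Edata and Esmooth, the two-pixel neighborhood, and the max_cost penalty for matches that fall outside the image are all assumptions made for this example.

```python
def energy(left, right, disparity, lam=1.0, max_cost=255):
    """Evaluate E(d) = Edata(d) + lam * Esmooth(d) on one scanline.

    left, right: lists of pixel intensities for the same image row.
    disparity:   one candidate disparity per left pixel (hypothetical input).
    """
    # Data term (assumed form): absolute intensity difference between each
    # left pixel x and the right pixel it is matched to, x - d.
    e_data = 0
    for x, d in enumerate(disparity):
        xr = x - d
        if 0 <= xr < len(right):
            e_data += abs(left[x] - right[xr])
        else:
            e_data += max_cost  # assumed penalty for out-of-image matches

    # Smoothness term (assumed form): absolute difference between the
    # disparities of horizontally adjacent pixels.
    e_smooth = sum(abs(disparity[x] - disparity[x - 1])
                   for x in range(1, len(disparity)))

    return e_data + lam * e_smooth
```

A global method would search over many candidate disparity assignments for the one with the lowest energy. This sketch only scores a single assignment and performs no optimization.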
Local methods estimate a disparity distance of a pixel independently over a window. Matching costs for the pixel in one of the stereo images and a candidate matching pixel in the other of the stereo images are aggregated over the window. The minimum matching cost among a plurality of candidate matching pixels with different disparity distances may be identified for selection of the disparity level of the pixel on a disparity map.
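The local procedure described above can be sketched as follows for a single scanline. This is a minimal illustration under stated assumptions rather than the disclosed method: the function name local_disparity, the absolute-difference matching cost, the one-dimensional window of the given radius, and the fixed penalty for candidates outside the image are all hypothetical choices.

```python
def local_disparity(left, right, max_disp, radius=1):
    """Pick, for each left pixel, the disparity with the minimum
    window-aggregated matching cost."""
    n = len(left)
    out_of_image = 255  # assumed cost when a candidate falls outside the image
    result = []
    for x in range(n):
        best_d, best_cost = 0, float("inf")
        for d in range(max_disp + 1):
            # Aggregate absolute differences over the window around x.
            cost = 0
            for u in range(max(0, x - radius), min(n, x + radius + 1)):
                ur = u - d
                if 0 <= ur < n:
                    cost += abs(left[u] - right[ur])
                else:
                    cost += out_of_image
            # Keep the candidate disparity with the lowest aggregated cost.
            if cost < best_cost:
                best_d, best_cost = d, cost
        result.append(best_d)
    return result
```

Each pixel is decided independently of its neighbors' results, which keeps the work low compared with a global optimization. Pixels near the image border may be assigned unreliable disparities because part of their window has no valid match.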
For both categories of methods, there is a tradeoff between depth map quality and computational complexity. For example, for global methods, the smoothness term Esmooth(d) in the energy function may use a larger neighborhood, such as a neighborhood of 8 instead of 2, in order to obtain better boundaries. However, optimizing the global energy then becomes close to computationally intractable. For local methods, the complexity is much lower than that of global methods, but at the cost of quality. To enhance quality, a higher number of bits may be used to represent a pixel so as to obtain a finer disparity map. However, matching cost calculation and aggregation are performed on a per-pixel basis, so a higher number of bits per pixel significantly increases the computational complexity.
Therefore, it is highly desirable to provide an imaging system and method that achieve an enhanced quality of disparity map without aggravating the complexity of the more computationally intensive part of the imaging system and method.