An image can be considered a projection from a three-dimensional (3D) scene onto a two-dimensional (2D) plane. Although a 2D image does not provide depth information, if two images of the same scene are available from different vantage points, the position (including the depth) of a 3D point can be found using known techniques.
For example, stereo matching is a process in which two images (a stereo image pair) of a scene taken from slightly different vantage points are matched to find disparities (differences in position) of image elements that depict the same scene element. The disparities provide information about the relative distance of the scene elements from the camera, which allows the depths of surfaces of objects in the scene to be determined. A stereo camera including, for example, two image capture devices separated from one another by a known distance, which may be referred to as the baseline distance, can be used to capture the stereo image pair.
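By way of illustration only, the relationship between disparity and depth can be sketched as follows. The sketch assumes a rectified stereo pair and a pinhole camera model, for which depth is commonly computed as Z = f * B / d. The function name, the focal length expressed in pixels, and the numeric values are assumptions and are not taken from the description above.

```python
def depth_from_disparity(disparity_px, focal_length_px, baseline_m):
    """Depth in metres for a rectified stereo pair: Z = f * B / d.

    Assumes a pinhole model with the focal length given in pixels and
    the baseline distance given in metres (illustrative conventions).
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive to give a finite depth")
    return focal_length_px * baseline_m / disparity_px


# Hypothetical camera: 800-pixel focal length, 5 cm baseline.
near_m = depth_from_disparity(40.0, focal_length_px=800.0, baseline_m=0.05)
far_m = depth_from_disparity(10.0, focal_length_px=800.0, baseline_m=0.05)
print(near_m, far_m)
```

Under these assumptions, a larger disparity corresponds to a scene element that is closer to the camera.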
Global-optimization methods, which are one type of stereo-matching technique, generate a disparity for each pixel in the reference image. Such approaches take into account global constraints (e.g., smoothness and/or image edges) and seek to minimize an energy function, which determines whether two pixels (i.e., one in the reference image and one in the search image) are a match. A global-optimization energy function typically comprises two terms: a matching cost term (M) and a smoothness term (S). The matching cost term (M) is a function of the reference and search pixel intensities (IR, IS) and all possible pixel disparities (DR-S), and the smoothness term (S) is a function of the intensities of adjacent pixels in the reference image (IR). The matching cost term (M) typically depends on the difference in intensity between two (potentially) matching pixels in the reference and search images. The smaller the difference in intensity between the two (potentially) matching pixels, the more likely it is that the pixels match (i.e., that they correspond to the same feature of an object in the scene whose image was captured). The smoothness term accounts for global constraints.
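A minimal sketch of a matching cost term along a single image row is given below. It assumes that the cost is the absolute difference in intensity, that the reference image is the left image (so the candidate search pixel for disparity d lies at column x - d), and that the images are rectified. These choices, and the function name, are illustrative assumptions rather than details stated above.

```python
def matching_costs(ref_row, search_row, x, max_disparity):
    """Absolute-difference matching cost for each candidate disparity.

    ref_row and search_row are lists of pixel intensities from the same
    row of the reference and search images. Candidates that fall outside
    the search row receive an infinite cost.
    """
    costs = []
    for d in range(max_disparity + 1):
        xs = x - d  # assumed convention: reference image is the left image
        if 0 <= xs < len(search_row):
            costs.append(abs(ref_row[x] - search_row[xs]))
        else:
            costs.append(float("inf"))
    return costs


# A bright feature at column 2 of the reference row appears at column 1
# of the search row, i.e., it is shifted by one pixel.
ref_row = [10, 10, 200, 10, 10]
search_row = [10, 200, 10, 10, 10]
costs = matching_costs(ref_row, search_row, x=2, max_disparity=2)
best_disparity = min(range(len(costs)), key=costs.__getitem__)
print(costs, best_disparity)
```

Picking the lowest cost per pixel in this way uses the matching cost term alone; a global method would additionally weigh the smoothness term before selecting a disparity.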
Generally, the intensity change should be small (i.e., smooth) for adjacent pixels unless there is, for example, an edge. The range of the smoothness term often is normalized to values between 0 and 1. An edge (i.e., a strong difference in intensity between adjacent pixels) corresponds to a 0 (or near 0), whereas a perfectly smooth region corresponds to a 1 (or near 1). The goal is to minimize the overall energy function, which combines the matching cost and smoothness terms. In other words, it is desirable to achieve a low matching cost (indicating that the reference/search pixel pair have nearly the same intensity) and large smoothness (indicating that adjacent pixels in the reference image have nearly the same value (e.g., nearly the same intensity and/or RGB values)).
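The normalized smoothness value can be sketched as follows. The description above does not specify how the value enters the energy function; the sketch assumes one common choice, in which the value scales a penalty on the change in disparity between adjacent pixels, so that a disparity jump is inexpensive across an edge and expensive within a smooth region. The linear normalization and all names are assumptions.

```python
def smoothness_weight(intensity_a, intensity_b, max_intensity=255):
    """Return 1.0 for identical neighbours and 0.0 for a maximal edge."""
    return 1.0 - abs(intensity_a - intensity_b) / max_intensity


def smoothness_penalty(ref_row, disparities):
    """Sum of weighted disparity changes between adjacent pixels in a row.

    Assumed formulation: each change in disparity is scaled by the
    smoothness weight of the corresponding reference-image neighbours.
    """
    total = 0.0
    for x in range(len(ref_row) - 1):
        weight = smoothness_weight(ref_row[x], ref_row[x + 1])
        total += weight * abs(disparities[x] - disparities[x + 1])
    return total


disparities = [1, 1, 3, 3]  # disparity jumps between columns 1 and 2
across_edge = smoothness_penalty([0, 0, 255, 255], disparities)
across_flat = smoothness_penalty([90, 90, 90, 90], disparities)
print(across_edge, across_flat)
```

With these inputs, the same disparity jump incurs no penalty where the reference row contains a strong edge and a penalty of 2.0 where the reference row is uniform.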
Stereo channels sometimes are implemented as low-resolution imagers (e.g., VGA resolution), which can result in edges of objects in a scene not being captured well in the images. When this happens, the smoothness term is not calculated properly. For example, large differences in intensity for adjacent pixels may not be captured, with the result that a part of the image that should be assigned a smoothness value of “0” instead is assigned a value of “1”. In such situations, the adjacent pixels tend to have more similar disparities (i.e., in order to minimize the energy function) than would otherwise be the case. Thus, over-smoothing occurs, which results in a poor-quality depth map. More generally, over-smoothing can result in a final disparity map in which changes in disparity between adjacent pixels are determined to be smaller than they should be at the edges (i.e., because the wrong pixels were matched).
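The over-smoothing effect can be illustrated with a toy example of two adjacent pixels whose true disparities are 1 and 5. The sketch assumes an energy of the form matching cost plus a weighted penalty on the disparity difference, and minimizes it by exhaustive search. The cost tables, the weighting factor `lam`, and the function name are hypothetical values chosen only for illustration.

```python
from itertools import product


def best_pair(costs_a, costs_b, weight, lam=10.0):
    """Exhaustively find the disparity pair with the lowest energy.

    Assumed energy: matching costs of both pixels plus
    lam * weight * |d_a - d_b|, where weight is the 0-to-1 smoothness value.
    """
    def energy(pair):
        d_a, d_b = pair
        return costs_a[d_a] + costs_b[d_b] + lam * weight * abs(d_a - d_b)

    return min(product(costs_a, costs_b), key=energy)


candidates = range(1, 6)
costs_a = {d: 4 * abs(d - 1) for d in candidates}  # pixel A matches best at 1
costs_b = {d: 3 * abs(d - 5) for d in candidates}  # pixel B matches best at 5

edge_detected = best_pair(costs_a, costs_b, weight=0.0)  # edge captured
edge_missed = best_pair(costs_a, costs_b, weight=1.0)    # edge not captured
print(edge_detected, edge_missed)
```

In this toy setting, when the edge is captured (smoothness value 0) both pixels keep their best-matching disparities, whereas when the edge is missed (smoothness value 1) the minimizer assigns both pixels the same disparity, so the disparity change at the edge is lost.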
Semi-global block matching (SGBM) is a particular technique for minimizing the global energy function. SGBM can provide advantages similar to those of other global-optimization techniques, and also tends to be computationally fast. Further, SGBM can be applied to any stereo image pair (e.g., RGB, grey-scale, high-resolution, low-resolution). However, SGBM is not always effective in determining disparities for pixels associated with object edges (e.g., when the input edges are low-resolution) and can lead to over-smoothing.
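The description above does not set out the internals of SGBM. As a hedged sketch only, the following shows the path-wise cost aggregation commonly described for semi-global matching, restricted to a single left-to-right pass over one row. A full implementation would typically combine several path directions and use block-based matching costs. The penalty names `p1` and `p2` and their values are illustrative assumptions.

```python
def aggregate_path(costs, p1=1.0, p2=4.0):
    """Aggregate matching costs along one left-to-right path.

    costs[x][d] is the matching cost of disparity d at column x. A change
    of one disparity level between neighbours adds p1, a larger change
    adds p2, and the previous minimum is subtracted to keep values bounded.
    """
    aggregated = [list(costs[0])]
    for x in range(1, len(costs)):
        prev = aggregated[-1]
        prev_min = min(prev)
        row = []
        for d, cost in enumerate(costs[x]):
            options = [prev[d], prev_min + p2]
            if d > 0:
                options.append(prev[d - 1] + p1)
            if d + 1 < len(prev):
                options.append(prev[d + 1] + p1)
            row.append(cost + min(options) - prev_min)
        aggregated.append(row)
    return aggregated


def argmin_per_column(table):
    """Pick the lowest-cost disparity independently at every column."""
    return [min(range(len(row)), key=row.__getitem__) for row in table]


# The third column weakly prefers disparity 2 (cost 1 versus cost 2).
costs = [[0, 5, 5], [0, 5, 5], [2, 5, 1]]
raw = argmin_per_column(costs)
smoothed = argmin_per_column(aggregate_path(costs))
print(raw, smoothed)
```

In this example the weak evidence for a disparity change at the third column is overruled by the penalties, which is consistent with the tendency toward over-smoothing noted above when edge information is weak.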