In image processing contexts such as computer stereo vision, 3-dimensional (3D) information may be extracted from 2-dimensional (2D) images. For example, by comparing information about a scene from two or more vantage points, 3D information may be generated by examining the relative positions of objects. The 3D information may be provided, for example, as a disparity map, a depth map, or the like having a channel that contains information relating to the distance of a pixel position from a viewpoint or plane (e.g., a nominal focal plane).
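As an illustration of how a disparity map may relate to a depth map, the following minimal sketch applies the standard relation Z = f * B / d, which holds under assumptions not stated in the text above: a rectified stereo pair and a pinhole camera model. The function name and the parameters `focal_length_px` and `baseline_m` are hypothetical and chosen only for this example.

```python
import numpy as np

def disparity_to_depth(disparity, focal_length_px, baseline_m):
    """Convert disparity (pixels) to depth (meters) via Z = f * B / d.

    Assumes a rectified stereo pair and a pinhole camera model.
    Pixels with non-positive disparity are treated as having no
    valid match and are assigned infinite depth.
    """
    disparity = np.asarray(disparity, dtype=np.float64)
    depth = np.full(disparity.shape, np.inf)
    valid = disparity > 0
    depth[valid] = focal_length_px * baseline_m / disparity[valid]
    return depth

# Example values (hypothetical): 500-pixel focal length, 0.1 m baseline.
example_depth = disparity_to_depth([[10.0, 0.0]], 500.0, 0.1)
```

Larger disparities may thus correspond to points nearer the cameras, while a disparity of zero is mapped here to infinite depth by convention.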
In extracting such 3D information, the fundamental task of processing a stereo pair of images may be to perform stereo correspondence, which determines which parts of one image (e.g., a left image) correspond to which parts of another image (e.g., a right image). For example, a stereo matching pipeline may include pre-processing (e.g., domain transformation), cost computation (e.g., application of a similarity metric), cost aggregation (e.g., across a support window), disparity/depth estimation (e.g., local and/or global), and post-processing (e.g., refinement). In some implementations, stereo matching performance may be largely determined by two factors: the definition of the cost function used in cost computation, which is based on the specific transform being implemented, and the cost volume filtering implemented in cost aggregation.
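The middle stages of such a pipeline may be sketched as follows. This is a minimal example under stated assumptions, not the technique of any particular implementation: the similarity metric is assumed to be a per-pixel absolute difference, the cost aggregation is assumed to be a box filter over a square support window, and the disparity estimation is assumed to be a local winner-takes-all selection. Pre-processing and post-processing are omitted, the images are assumed to be rectified grayscale arrays, and the names `box_filter` and `block_match` are hypothetical.

```python
import numpy as np

def box_filter(img, radius):
    """Mean over a (2r+1) x (2r+1) support window, with edge padding."""
    k = 2 * radius + 1
    h, w = img.shape
    padded = np.pad(img, radius, mode="edge")
    out = np.zeros((h, w), dtype=np.float64)
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + h, dx:dx + w]
    return out / (k * k)

def block_match(left, right, max_disp, radius=1):
    """Return an integer disparity map for the left image."""
    left = np.asarray(left, dtype=np.float64)
    right = np.asarray(right, dtype=np.float64)
    h, w = left.shape
    cost_volume = np.empty((max_disp + 1, h, w))
    for d in range(max_disp + 1):
        # Cost computation: compare left(x) with right(x - d).
        shifted = right.copy()
        if d > 0:
            shifted[:, d:] = right[:, :w - d]
            shifted[:, :d] = right[:, :1]  # replicate border where x - d < 0
        cost = np.abs(left - shifted)
        # Cost aggregation: average the cost across the support window.
        cost_volume[d] = box_filter(cost, radius)
    # Disparity estimation: pick the lowest-cost disparity per pixel.
    return np.argmin(cost_volume, axis=0)
```

In this sketch, changing the body of the cost computation step or replacing `box_filter` with a different cost volume filter changes the matching behavior without altering the rest of the pipeline, which is one way to see why those two stages may dominate matching performance.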
It may be advantageous to perform stereo correspondence with greater accuracy in the resultant disparity map for improved computer stereo vision processing. It is with respect to these and other considerations that the present improvements have been needed. Such improvements may become critical as the desire to provide 3D image characteristics becomes more widespread.