Accurately estimating depth from stereo images, in other words generating a disparity map from them, is a core problem for many computer vision applications, such as autonomous (self-driving) vehicles, robot vision and augmented reality. More generally, the investigation of displacements between two (correlated) images is a widely applied tool today.
Accordingly, several different approaches are available for generating disparity maps on the basis of stereo images.
In the approaches disclosed in U.S. Pat. No. 5,727,078, US 2008/0267494 A1, US 2011/0176722 A1, US 2012/0008857 A1, US 2014/0147031 A1 and U.S. Pat. No. 9,030,530 B2, low-resolution images are generated from the stereo images recorded from a scene. Disparity analysis is performed on these low-resolution images, and the disparity map obtained by this analysis is then enlarged, or its accuracy is increased by gradually applied enlargements.
In WO 00/27131 A1 and WO 2016/007261 A1, a separate hierarchical series of successively lower-resolution images is generated for each of the stereo images. In these approaches the disparity map is generated based on the coarsest-level left and right images, and the resulting low-resolution disparity map is then upscaled.
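The general coarse-to-fine idea described above can be illustrated by the following minimal sketch. It is not taken from any of the cited documents; the function names (downsample, block_match, upsample_disparity), the sum-of-absolute-differences matching cost and the nearest-neighbour enlargement are assumptions chosen only for illustration. The sketch assumes a rectified stereo pair in which a left-image pixel at column x corresponds to the right-image pixel at column x - d.

```python
import numpy as np

def downsample(img, factor):
    """Average-pool a 2-D image by an integer factor (any remainder is cropped)."""
    h2, w2 = img.shape[0] // factor, img.shape[1] // factor
    cropped = img[:h2 * factor, :w2 * factor]
    return cropped.reshape(h2, factor, w2, factor).mean(axis=(1, 3))

def block_match(left, right, max_disp, radius=1):
    """Brute-force block matching with a sum-of-absolute-differences cost.

    Returns, for each left-image pixel, the integer disparity d in
    [0, max_disp] that minimises the cost over a (2*radius+1)^2 window.
    """
    h, w = left.shape
    size = 2 * radius + 1
    pad_l = np.pad(left, radius, mode="edge")
    pad_r = np.pad(right, radius, mode="edge")
    best_cost = np.full((h, w), np.inf)
    disparity = np.zeros((h, w), dtype=np.int64)
    for d in range(max_disp + 1):
        # Align right-image column x - d with left-image column x.
        diff = np.abs(pad_l - np.roll(pad_r, d, axis=1))
        cost = np.zeros((h, w))
        for dy in range(size):
            for dx in range(size):
                cost += diff[dy:dy + h, dx:dx + w]
        # The first d columns have no valid counterpart (np.roll wraps around).
        cost[:, :d] = np.inf
        better = cost < best_cost
        disparity[better] = d
        best_cost[better] = cost[better]
    return disparity

def upsample_disparity(disp, factor):
    """Enlarge a disparity map by pixel repetition and rescale its values."""
    enlarged = np.repeat(np.repeat(disp, factor, axis=0), factor, axis=1)
    return enlarged * factor  # disparities scale with the image width

# Synthetic example: the left image is the right image shifted by 8 pixels.
rng = np.random.default_rng(0)
right = rng.integers(0, 256, size=(32, 64)).astype(np.float64)
left = np.roll(right, 8, axis=1)

factor = 2
coarse_disp = block_match(downsample(left, factor),
                          downsample(right, factor), max_disp=8)
full_disp = upsample_disparity(coarse_disp, factor)
```

In this sketch the search at the coarse level covers only half as many candidate disparities on a quarter as many pixels, which is the source of the cost saving, while the enlarged map is accurate only to within the downsampling factor unless it is refined further at the finer levels.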
In CN 105956597 A, the stereo images are processed with the help of neural networks to obtain a disparity map. Neural networks are used for generating feature maps and, for example, disparity for stereo images or other types of output in the following papers (several of them are available as pre-prints in the arXiv open access database of Cornell University Library):
    A. Kendall et al.: End-to-end learning of geometry and context for deep stereo regression, 2017, arXiv: 1703.04309 (in the following: Kendall);
    Y. Zhong et al.: Self-supervised learning for stereo matching with self-improving ability, 2017, arXiv: 1709.00930 (in the following: Zhong);
    N. Mayer et al.: A large dataset to train convolutional networks for disparity, optical flow, and scene flow estimation, 2015, arXiv: 1512.02134 (in the following: Mayer);
    Ph. Fischer et al.: FlowNet: Learning optical flow with convolutional networks, 2015, arXiv: 1504.06852 (in the following: Fischer);
    J. Pang et al.: Cascade residual learning: A two-stage convolutional neural network for stereo matching, 2017, arXiv: 1708.09204 (in the following: Pang);
    C. Godard et al.: Unsupervised monocular depth estimation with left-right consistency, 2017, arXiv: 1609.03677 (in the following: Godard).
A disadvantage of many of the prior art approaches applying neural networks is their high implementation complexity. In addition, the computational costs are high in most of the known approaches.
In view of the known approaches, there is a demand for a method and an apparatus for generating a displacement map of a first input dataset and a second input dataset of an input dataset pair (e.g. for generating a disparity map of a stereo image pair), with which the displacement map (e.g. disparity map) can be generated in a computationally cost-effective way.