In the technical field of computer vision, three-dimensional (3D) content is provided to an auto-stereoscopic display so that the display can present a 3D image with stereoscopic visual perception.
The above-mentioned 3D image includes image plus depth information, which can also be referred to as 2D plus Z information, i.e., a 2D image accompanied by depth information. The depth information can be, for example, a depth map corresponding to the 2D image. That is, the depth information can contain a depth value for each pixel in the 2D image. Based on the 2D image and the corresponding depth information, the auto-stereoscopic display can exhibit a 3D image, enabling users to have a stereoscopic visual experience when viewing the generated 3D image.
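As an illustration of the 2D plus Z representation, the following minimal Python sketch pairs a grayscale image with a same-sized depth map and derives a second view by shifting pixels horizontally according to depth. This is only one simple possibility, not the rendering method of any particular display. The names `ImagePlusDepth` and `synthesize_view`, the 0 to 255 depth range, the convention that a larger depth value means a nearer pixel, and the left-neighbor hole filling are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ImagePlusDepth:
    """A 2D image paired with a depth map of the same size (2D plus Z)."""
    image: List[List[int]]  # grayscale intensities, rows x columns
    depth: List[List[int]]  # one depth value per pixel (assumed: larger = nearer)

    def __post_init__(self) -> None:
        # The depth map must supply exactly one value per image pixel.
        if len(self.image) != len(self.depth):
            raise ValueError("image and depth must have the same number of rows")
        for img_row, z_row in zip(self.image, self.depth):
            if len(img_row) != len(z_row):
                raise ValueError("image and depth rows must have the same length")


def synthesize_view(frame: ImagePlusDepth, max_shift: int,
                    max_depth: int = 255) -> List[List[int]]:
    """Shift each pixel to the right in proportion to its depth value."""
    result = []
    for img_row, z_row in zip(frame.image, frame.depth):
        width = len(img_row)
        new_row: List[Optional[int]] = [None] * width
        nearest = [-1] * width  # depth of the pixel currently written at each target
        for x, (value, z) in enumerate(zip(img_row, z_row)):
            target = x + round(max_shift * z / max_depth)
            # When two pixels land on the same target, keep the nearer one.
            if 0 <= target < width and z > nearest[target]:
                new_row[target] = value
                nearest[target] = z
        # Fill holes left by shifted pixels with the closest value to the left.
        last = 0
        for x in range(width):
            if new_row[x] is None:
                new_row[x] = last
            else:
                last = new_row[x]
        result.append(new_row)
    return result


frame = ImagePlusDepth(image=[[10, 20, 30, 40]], depth=[[0, 0, 255, 0]])
shifted = synthesize_view(frame, max_shift=1)  # the near pixel (30) moves one column right
```

In this sketch the near pixel occludes the far pixel it lands on, and the gap it leaves behind is filled from its left neighbor. Practical systems typically use more careful hole filling.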
In order for the auto-stereoscopic display to exhibit 3D images, the depth of the scene in the 2D image must be estimated. A conventional stereoscopic vision approach estimates the depth from two images that are captured of the same scene and correspond to the views of the two human eyes. Another approach estimates the depth from multiple images captured at different view angles. Moreover, for the sake of cost reduction and operational convenience, depth estimation can also be performed on a single input image provided by a camera device with a single lens module.
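The two-image approach can be sketched as follows, assuming a rectified pinhole stereo pair in which corresponding points lie on the same scanline. The sketch finds the horizontal offset (disparity) of a small window between the left and right rows by minimizing the sum of absolute differences, then converts it to depth with the standard triangulation relation Z = f * B / d. The function names, window size, and camera parameters are hypothetical and chosen only for illustration.

```python
from typing import List


def best_disparity(left_row: List[int], right_row: List[int],
                   x: int, window: int, max_disparity: int) -> int:
    """Return the disparity d that minimizes the sum of absolute differences
    between the left window centered at x and the right window centered at x - d."""
    half = window // 2
    best_d, best_cost = 0, None
    for d in range(max_disparity + 1):
        cost = 0
        valid = True
        for k in range(-half, half + 1):
            xl, xr = x + k, x + k - d
            if not (0 <= xl < len(left_row) and 0 <= xr < len(right_row)):
                valid = False  # window falls outside a row; skip this candidate
                break
            cost += abs(left_row[xl] - right_row[xr])
        if valid and (best_cost is None or cost < best_cost):
            best_d, best_cost = d, cost
    return best_d


def depth_from_disparity(disparity_px: float, focal_length_px: float,
                         baseline_m: float) -> float:
    """Triangulate depth Z = f * B / d for a rectified stereo pair."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive to triangulate a finite depth")
    return focal_length_px * baseline_m / disparity_px


# The same small feature appears two pixels further left in the right image.
left_row = [0, 0, 0, 0, 9, 5, 7, 0, 0, 0]
right_row = [0, 0, 9, 5, 7, 0, 0, 0, 0, 0]
d = best_disparity(left_row, right_row, x=5, window=3, max_disparity=4)

# Assumed camera parameters: 800-pixel focal length, 0.065 m baseline.
z = depth_from_disparity(d, focal_length_px=800.0, baseline_m=0.065)
```

Because depth is inversely proportional to disparity, nearby objects shift more between the two views than distant ones. This relation does not apply to a single input image, for which no disparity is available.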
In a conventional way of estimating the depth information from a single input image, the input image is analyzed for image characteristic information, and a classification process is performed. In this way, scene characteristics of the input image, such as a ground area, a building, a human body, or a vehicle, can be obtained and then serve as the basis for determining the image depth. However, such an approach requires time-consuming training in order to classify the input image. Hence, how to generate the depth information corresponding to a single input image remains a subject of industrial endeavor.