With the rapid growth of the mobile phone market, more and more people have become accustomed to taking pictures with phone cameras. Phone cameras are gaining functionality through advanced computational photography techniques. For instance, iPhone 7 and Huawei's Honor 8 use a dual-camera system to simulate a shallow depth of field (DoF). Google's “Lens Blur” app achieves similar results by moving the camera. In essence, the camera captures images from different viewpoints, making it possible to recover depth from such images by exploiting parallax. The depth information is then used to synthesize the shallow DoF. However, the depth maps produced by phone cameras are often of rather poor quality, especially in boundary regions. In addition, the high computational cost prevents the camera from responding to users instantly.
To obtain high-quality depth information, the above-mentioned approaches often require a complex camera system or a longer capture time. To overcome these limitations, we focus on estimating depth information from a focal stack, which a phone camera already makes available. Each time a user takes a photo with a mobile phone, the camera rapidly sweeps the focal plane through the scene to find the best auto-focus setting. The resulting set of images is called a focal stack, and it contains depth information about the scene. For phones that come with a dual-camera system, the captured images form a binocular focal stack.
To obtain depth from a focal stack, one conventional approach is depth-from-focus (DfF), which exploits differences in sharpness at each pixel across the focal stack and assigns the layer with the highest sharpness as that pixel's depth. To exploit binocular cues, traditional stereo matching algorithms rely on feature matching and optimization to maintain the Markov Random Field (MRF) property: the disparity field should be smooth everywhere except for abrupt changes at occlusion boundaries. Both methods use optimization algorithms, e.g., graph-cut and belief propagation, to find the optimal results. However, the optimization process tends to be slow. Meanwhile, there is very little work on combining depth from focus and disparity.
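The conventional DfF step described above can be illustrated with a minimal sketch. It is not the approach of this disclosure, and it rests on several assumptions: the focal stack is a grayscale NumPy array of shape (layers, height, width), sharpness is measured by the absolute discrete Laplacian (one common focus measure among many), and the function names are hypothetical. The graph-cut or belief-propagation smoothing stage is omitted, so the output is only a raw per-pixel layer index.

```python
import numpy as np


def laplacian_sharpness(img):
    """Per-pixel sharpness: absolute 4-neighbor discrete Laplacian.

    Assumed focus measure for illustration; borders use edge replication.
    """
    p = np.pad(np.asarray(img, dtype=np.float64), 1, mode="edge")
    lap = (p[:-2, 1:-1] + p[2:, 1:-1]      # up + down neighbors
           + p[1:-1, :-2] + p[1:-1, 2:]    # left + right neighbors
           - 4.0 * p[1:-1, 1:-1])          # minus 4 x center
    return np.abs(lap)


def depth_from_focus(stack):
    """Return, for each pixel, the index of the sharpest focal-stack layer.

    stack: array of shape (num_layers, H, W), one grayscale image per
    focus setting. The result has shape (H, W).
    """
    sharpness = np.stack([laplacian_sharpness(layer) for layer in stack])
    return np.argmax(sharpness, axis=0)
```

For example, calling `depth_from_focus(stack)` on a stack whose first layer is textured only on the left and whose last layer is textured only on the right labels the left pixels with the first layer's index and the right pixels with the last layer's index. On real images, textureless regions have near-zero sharpness in every layer, which is why the conventional pipeline adds an optimization stage to regularize the result.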
To address these issues, in this disclosure we develop several networks to obtain depth information from a focal stack or a binocular focal stack. Our approaches obtain higher-quality results in less time, and are thus more accurate and more efficient.