This section is intended to provide a background to the various embodiments of the technology described in this disclosure. The description in this section may include concepts that could be pursued, but are not necessarily ones that have been previously conceived or pursued. Therefore, unless otherwise indicated herein, what is described in this section is not prior art to the description and/or claims of this disclosure and is not admitted to be prior art by the mere inclusion in this section.
A light field is a concept from computer graphics and computer vision, defined as the set of all light rays at every point in space travelling in every direction. A light-field camera, also called a plenoptic camera, is a type of camera that uses a microlens array to capture 4D (four-dimensional) light field information about a scene, because each captured ray is attributed a direction in addition to its spatial position. A light-field camera has a microlens array just in front of the imaging sensor. The array may consist of many microscopic lenses with tiny focal lengths, which split up what would have become a single 2D pixel (length and width) into individual light rays just before they reach the sensor. This differs from a conventional camera, which only uses the two available dimensions of the film or sensor. The resulting raw image captured by a plenoptic camera is a composition of many tiny images, one formed behind each microlens.
A plenoptic camera can capture the light field information of a scene. This information can then be post-processed, after capture, to reconstruct images of the scene from different points of view. It also permits a user to change the focus point of the images. As described above, a plenoptic camera contains extra optical components, compared to a conventional camera, to achieve this.
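As an illustration of how a raw image made of many tiny microlens images relates to views of the scene, the following is a minimal sketch and not the method of any particular camera. It assumes an idealized sensor in which every microlens covers exactly a p x p block of pixels aligned with the pixel grid; the function name extract_views and the parameter p are hypothetical. A real device would need calibration before any such rearrangement.

```python
def extract_views(raw, p):
    """Rearrange an idealized raw plenoptic image into p*p views.

    Assumption (illustrative only): each microlens covers exactly a
    p x p block of sensor pixels, aligned with the pixel grid.
    View (u, v) collects the pixel at offset (u, v) under every microlens.
    """
    rows, cols = len(raw), len(raw[0])
    if rows % p or cols % p:
        raise ValueError("raw image size must be a multiple of p")
    views = {}
    for u in range(p):          # vertical offset under each microlens
        for v in range(p):      # horizontal offset under each microlens
            views[(u, v)] = [
                [raw[y + u][x + v] for x in range(0, cols, p)]
                for y in range(0, rows, p)
            ]
    return views


# A 4x4 raw image with 2x2 pixels per microlens yields four 2x2 views.
raw = [[0, 1, 2, 3],
       [4, 5, 6, 7],
       [8, 9, 10, 11],
       [12, 13, 14, 15]]
views = extract_views(raw, 2)
```

Under this assumption each view has one pixel per microlens, which illustrates why the spatial resolution of a single view is lower than that of the sensor.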
The plenoptic data captured by an unfocused plenoptic camera are known as the unfocused (type 1) plenoptic data, and those captured by a focused plenoptic camera are known as the focused (type 2) plenoptic data.
One exemplary algorithm for estimating the disparities of the unfocused (type 1) plenoptic data based on block-matching was discussed in the reference written by N. Sabater, V. Drazic, M. Seifi, G. Sandri, and P. Perez, “Light field demultiplexing and disparity estimation,” HAL, 2014 (hereinafter referred to as reference 1).
In the type 2 plenoptic camera, the distance between the microlens array and the sensor differs from the focal length of the microlenses. This configuration sacrifices some of the angular resolution achievable with the type 1 configuration in exchange for better spatial resolution.
Because plenoptic data provide several aligned views of the scene, one intuitive application is to estimate the depth of the scene. Known solutions for depth estimation usually estimate the disparity of pixels between the views. Generally, the disparity d of a pixel (x,y) of a view I_{i,j} with respect to another view I_{k,j} is estimated as the displacement of this pixel on the view I_{k,j}. The known estimation methods are usually time consuming and not very accurate on non-textured areas. The block-matching method, as described in reference 1, also suffers from low accuracy around edges in the scene, and its results are degraded by the foreground fattening effect.
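The notion of disparity as a displacement between two views can be sketched with a generic block-matching search. This is an illustrative example under stated assumptions, not the algorithm of reference 1: it searches integer horizontal displacements only, uses a sum of absolute differences (SAD) as the cost, and the names block_matching_disparity, half and max_disp are hypothetical.

```python
def block_matching_disparity(ref, other, x, y, half=1, max_disp=3):
    """Return the integer horizontal displacement d such that the block
    centred on (x, y) in `ref` best matches the block centred on
    (x + d, y) in `other`, using a sum of absolute differences (SAD).

    Illustrative assumptions: views are lists of rows of numbers, the
    search is horizontal only, and ties keep the first candidate found.
    """
    h, w = len(ref), len(ref[0])
    if not (half <= x < w - half and half <= y < h - half):
        raise ValueError("block around (x, y) leaves the image")
    best_d, best_cost = None, float("inf")
    for d in range(-max_disp, max_disp + 1):
        if x + d - half < 0 or x + d + half >= w:
            continue  # candidate block would leave the other view
        cost = 0
        for dy in range(-half, half + 1):
            for dx in range(-half, half + 1):
                cost += abs(ref[y + dy][x + dx] - other[y + dy][x + d + dx])
        if cost < best_cost:
            best_cost, best_d = cost, d
    return best_d


# Textured view and a copy shifted two pixels to the right.
ref = [[x * x + y for x in range(12)] for y in range(5)]
shifted = [[row[x - 2] if x >= 2 else 0 for x in range(12)] for row in ref]
d_textured = block_matching_disparity(ref, shifted, 5, 2)

# Non-textured view: every candidate has the same cost, so the result
# only reflects the tie-breaking rule, not the scene.
flat = [[5] * 12 for _ in range(5)]
d_flat = block_matching_disparity(flat, flat, 5, 2)
```

On the textured pair the search recovers the shift of two pixels. On the flat pair all candidates tie, which illustrates why block-based estimates tend to be unreliable on non-textured areas.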
Another exemplary algorithm is discussed in the reference written by S. Wanner and B. Goldluecke, “Variational light field analysis for disparity estimation and super-resolution,” IEEE Transactions on Pattern Analysis and Machine Intelligence, 2013 (hereinafter referred to as reference 2). In reference 2, structure tensors (gradients) are calculated to decide which pixels will be used for estimating the disparities. However, this algorithm requires many eigenvalue decompositions to be calculated on small images.
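To give a sense of the per-patch computation involved, the following sketch builds a 2x2 structure tensor from image gradients over a small patch and computes its eigenvalues in closed form. It is an illustration only and not the implementation of the cited reference: the central-difference gradients, the coherence measure, and the names structure_tensor and eigen_analysis are assumptions made for this example.

```python
import math


def structure_tensor(patch):
    """Sum the gradient outer products over the interior of a small patch.

    Returns (jxx, jxy, jyy), the entries of the symmetric 2x2 tensor
    [[jxx, jxy], [jxy, jyy]]. Gradients use central differences.
    """
    jxx = jxy = jyy = 0.0
    for y in range(1, len(patch) - 1):
        for x in range(1, len(patch[0]) - 1):
            gx = (patch[y][x + 1] - patch[y][x - 1]) / 2.0
            gy = (patch[y + 1][x] - patch[y - 1][x]) / 2.0
            jxx += gx * gx
            jxy += gx * gy
            jyy += gy * gy
    return jxx, jxy, jyy


def eigen_analysis(jxx, jxy, jyy):
    """Closed-form eigenvalues of a symmetric 2x2 matrix, plus a coherence
    value in [0, 1] (an assumed reliability measure: 1 for a single
    dominant orientation, 0 for a flat or isotropic patch)."""
    mean = (jxx + jyy) / 2.0
    root = math.sqrt(((jxx - jyy) / 2.0) ** 2 + jxy ** 2)
    lam1, lam2 = mean + root, mean - root
    total = lam1 + lam2
    coherence = ((lam1 - lam2) / total) ** 2 if total > 0 else 0.0
    return lam1, lam2, coherence


# A horizontal intensity ramp has one dominant gradient direction.
ramp = [[float(x) for x in range(4)] for _ in range(4)]
ramp_result = eigen_analysis(*structure_tensor(ramp))

# A flat patch has no gradient, so no orientation can be trusted.
flat = [[3.0] * 4 for _ in range(4)]
flat_result = eigen_analysis(*structure_tensor(flat))
```

A pixel-selection rule could, for example, keep only patches whose coherence exceeds a threshold. Repeating this analysis for every small patch is what makes such an approach computationally demanding.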