Light-field image and video processing offers a much richer variety of image manipulation possibilities than traditional 2D imaging. However, capturing high-quality light-fields remains an unsolved problem, because a very large number of different views must be captured while maintaining excellent image quality in terms of dynamic range, color fidelity and resolution.
Traditional 2D images represent the projection of the three-dimensional world onto a two-dimensional plane. In digital images, this plane is rasterized into a grid of so-called pixels. For every visible point in space, a 2D image records the intensity in one or multiple pixels.
Stereoscopic images extend this principle by recording two different views of a scene. By showing the left captured image to the left eye and the right captured image to the right eye, a depth impression can be provided to the user. While in theory this significantly improves the visual experience, the literature reports various shortcomings, such as convergence conflicts, difficulties in adapting the content to varying screen sizes, and many more.
Mathematically, a light-field can be described by a five-dimensional function L_{λ,t}(x, y, z, θ, ϕ) that assigns a radiance to every point in space (x, y, z) and every direction (θ, ϕ). The parameters λ and t denote the wavelength (color information) and time. Light-field imaging goes beyond the previously mentioned technologies by capturing a much larger number of viewing positions of a scene. These views are typically arranged along a surface such as a plane (the so-called 4D light field [4] or Lumigraph [5]). The views then differ not only in horizontal position, as for stereoscopic images, but also in vertical position. Ideally, the individual views are spaced arbitrarily densely, such that all rays from the scene traversing the chosen surface can be captured.
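The reduction from the five-dimensional description to a 4D light field on a plane can be illustrated with the common two-plane parameterization, in which a ray is identified by its intersection points with two parallel planes. The following is a minimal sketch, not part of the original disclosure. The function name `ray_to_two_plane`, the placement of the capture plane at z = 0, the parameter `plane_distance`, and the angle convention (θ measured from the z-axis, ϕ as azimuth in the x-y plane) are all assumptions made for illustration.

```python
import math


def ray_to_two_plane(x, y, theta, phi, plane_distance=1.0):
    """Map a ray to two-plane coordinates (u, v, s, t).

    The ray leaves the capture plane z = 0 at (x, y). Its direction is
    given by theta (polar angle from the z-axis) and phi (azimuth in the
    x-y plane); this angle convention is an assumption of the sketch.
    (u, v) is the intersection with z = 0, (s, t) the intersection with
    the parallel plane z = plane_distance.
    """
    # Unit direction vector from the two angles.
    dx = math.sin(theta) * math.cos(phi)
    dy = math.sin(theta) * math.sin(phi)
    dz = math.cos(theta)
    if dz <= 0.0:
        raise ValueError("ray does not travel towards the second plane")
    # Scale the direction so that the ray advances exactly
    # plane_distance along the z-axis.
    scale = plane_distance / dz
    return (x, y, x + scale * dx, y + scale * dy)


if __name__ == "__main__":
    # A ray perpendicular to the capture plane hits both planes at the
    # same lateral position.
    print(ray_to_two_plane(0.5, -0.2, 0.0, 0.0))
```

Because the position along the ray is dropped, four coordinates suffice. This relies on the radiance being constant along a ray in free space, which is the usual assumption behind the 4D light field.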
This huge amount of information permits much richer editing and manipulation of the captured images than traditional 2D technology. The possibilities include, among others, changing focal points and depths, creating virtual viewing positions, depth-based compositing and special effects such as dolly zoom [6]. A possible processing chain is described in [7].
However, capturing a light-field of sufficient quality remains an unsolved problem, which is addressed by the following invention.
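As an illustration of one such manipulation, a change of focal depth can be sketched with the well-known shift-and-add technique: each view is shifted in proportion to its camera offset and the shifted views are averaged, so that scene points whose disparity matches the chosen value align and appear sharp. This is a minimal sketch under stated assumptions and is not the processing chain of [7]. The function name `refocus`, the integer camera offsets, the integer `disparity` parameter and the border clamping are hypothetical choices made for illustration.

```python
def refocus(views, positions, disparity):
    """Shift-and-add refocusing over a set of views.

    views:     list of images, each a list of rows of intensities
    positions: list of integer camera offsets (u, v), one per view
    disparity: integer pixel shift per unit of camera offset; it selects
               the depth that ends up in focus (assumed convention)
    """
    height = len(views[0])
    width = len(views[0][0])
    out = [[0.0] * width for _ in range(height)]
    for image, (u, v) in zip(views, positions):
        dx = u * disparity
        dy = v * disparity
        for y in range(height):
            for x in range(width):
                # Sample the shifted view; clamp at the image border.
                sx = min(max(x + dx, 0), width - 1)
                sy = min(max(y + dy, 0), height - 1)
                out[y][x] += image[sy][sx]
    count = len(views)
    return [[value / count for value in row] for row in out]


if __name__ == "__main__":
    # Three views on a horizontal line; one bright point that moves by
    # one pixel per unit of camera offset.
    views = [
        [[0, 1, 0, 0, 0]],  # camera at u = -1
        [[0, 0, 1, 0, 0]],  # camera at u = 0
        [[0, 0, 0, 1, 0]],  # camera at u = +1
    ]
    positions = [(-1, 0), (0, 0), (1, 0)]
    print(refocus(views, positions, disparity=1))  # point aligned, sharp
    print(refocus(views, positions, disparity=0))  # point spread, blurred
```

With a matching disparity the point accumulates in a single pixel, while a mismatched disparity spreads it over neighboring pixels. The sketch also hints at why many densely spaced views are desirable: with few views, the out-of-focus blur consists of a few discrete copies rather than a smooth spread.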
There are two fundamental techniques to capture a light-field. On the one hand, there exists a variety of plenoptic cameras [8, 9, 10]. Compared to traditional cameras, they introduce an additional array of so-called micro lenses between the main lens and the sensor. By these means, it is indeed possible to capture different viewing positions. However, the captured views remain rather similar to one another. Moreover, because of the small size of the micro lenses, high-quality imaging at the level of digital cinema has not yet been achieved.
On the other hand, light-fields can be acquired by means of multi-camera arrays [11, 12, 13]. Given that many different views are required in order to avoid artifacts when performing image manipulations, the physical dimensions of the cameras need to be rather small. Moreover, low-cost cameras are typically used in order to make the overall system affordable.
However, because of their limited size and cost, the image quality provided by these cameras cannot reach the highest level that is technologically possible today. For instance, color reproduction, dynamic range and signal-to-noise ratio are much worse for small, cheap cameras than for the professional devices used in digital cinema productions. Given that these cinema cameras are large and expensive, combining them into large multi-camera arrays for light-field acquisition is prohibitive. As a consequence, applications with the highest quality requirements cannot be served by light-field technology, although the resulting editing possibilities would be highly welcome. The same drawback holds for all applications where, for cost reasons, a single 2D camera cannot be replaced by a multitude of cameras in order to capture a light-field.