This section is intended to introduce the reader to various aspects of art, which may be related to various aspects of the present disclosure that are described and/or claimed below. This discussion is believed to be helpful in providing the reader with background information to facilitate a better understanding of the various aspects of the present invention. Accordingly, it should be understood that these statements are to be read in this light, and not as admissions of prior art.
Conventional image capture devices render a three-dimensional scene onto a two-dimensional sensor. During operation, a conventional capture device captures a two-dimensional (2-D) image representing an amount of light that reaches a photosensor (or photodetector) within the device. However, this 2-D image contains no information about the directional distribution of the light rays that reach the photosensor (which may be referred to as the light field). Depth, for example, is lost during the acquisition. Thus, a conventional capture device does not store most of the information about the light distribution from the scene.
Light field capture devices (also referred to as “light field data acquisition devices”) have been designed to measure a four-dimensional (4D) light field of the scene by capturing the light from different viewpoints of that scene. Thus, by measuring the amount of light traveling along each beam of light that intersects the photosensor, these devices can capture additional optical information (information about the directional distribution of the bundle of light rays) for providing new imaging applications by post-processing. The information acquired/obtained by a light field capture device is referred to as the light field data. Light field capture devices are defined herein as any devices that are capable of capturing light field data. There are several types of light field capture devices, among which: plenoptic devices, which use a microlens array placed between the image sensor and the main lens, as described in document US 2013/0222633; and a camera array, where all cameras image onto a single shared image sensor.
The light field data may also be simulated with Computer Generated Imagery (CGI), from a series of 2-D images of a scene each taken from a different viewpoint by the use of a conventional handheld camera.
Light field data processing comprises notably, but is not limited to, generating refocused images of a scene, generating perspective views of a scene, generating depth maps of a scene, generating extended depth of field (EDOF) images, generating stereoscopic images, and/or any combination of these.
The present disclosure focuses more precisely on light field based images captured by a plenoptic device as illustrated by FIG. 1 disclosed by R. Ng, et al. in “Light field photography with a hand-held plenoptic camera” Stanford University Computer Science Technical Report CSTR 2005-02, no. 11 (April 2005).
Such a plenoptic device is composed of a main lens (11), a micro-lens array (12) and a photo-sensor (13). More precisely, the main lens focuses the subject onto (or near) the micro-lens array. The micro-lens array (12) separates the converging rays into an image on the photo-sensor (13) behind it.
A micro-image (14) is the image formed on the photo-sensor behind a considered micro-lens of the micro-lens array (12) as illustrated by FIG. 2 disclosed by http://www.tgeorgiev.net/ where the image on the left corresponds to raw data and the image on the right corresponds to details of micro-images representing in particular a seagull's head. The resolution and number of micro-images depend on the size of the micro-lenses with respect to the sensor. More precisely, the micro-image resolution varies significantly depending on devices and applications (from 2×2 pixels up to around 100×100 pixels).
Then, from the micro-images, sub-aperture images are reconstructed. Such a reconstruction consists in gathering collocated pixels from every micro-image. The more numerous the micro-lenses, the higher the resolution of the sub-aperture images. More precisely, as illustrated by FIG. 3, considering that one micro-lens overlaps N×N pixels of the photo-sensor (15), the N×N matrix of views (17) is obtained by considering that the ith view contains the ith pixel overlapped by each of the L×L micro-lenses of the micro-lens array (16), i.e. L×L pixels in total, where “×” is a multiplication operator.
More precisely, in FIG. 3, L=8 and N=4; the first view 300 thus comprises the first of the sixteen pixels covered by each of the 64 micro-lenses of the considered micro-lens array.
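By way of a purely illustrative and non-limiting sketch, which is not taken from the cited documents, the gathering of collocated pixels described above may be written as follows. The sketch assumes an idealized monochrome sensor in which the micro-lenses lie on a regular grid aligned with the pixel grid and each micro-lens covers exactly N×N pixels; de-mosaicing, vignetting and lens misalignment are ignored. The function name extract_views and the toy sensor contents are hypothetical.

```python
def extract_views(raw, n):
    """Rearrange an idealized raw plenoptic image into an n x n matrix of views.

    raw is a list of pixel rows whose height and width are multiples of n.
    views[u][v][i][j] is the pixel at offset (u, v) under micro-lens (i, j).
    """
    height, width = len(raw), len(raw[0])
    if height % n or width % n:
        raise ValueError("sensor size must be a multiple of n")
    lens_rows, lens_cols = height // n, width // n
    return [
        [
            # View (u, v): one collocated pixel taken from every micro-image.
            [[raw[i * n + u][j * n + v] for j in range(lens_cols)]
             for i in range(lens_rows)]
            for v in range(n)
        ]
        for u in range(n)
    ]


# Toy sensor with the dimensions of FIG. 3: L = 8 micro-lenses per side,
# N = 4 pixels per side under each micro-lens. Each pixel stores its own
# (row, column) coordinates so that the rearrangement is easy to inspect.
L, N = 8, 4
raw = [[(r, c) for c in range(L * N)] for r in range(L * N)]
views = extract_views(raw, N)
first_view = views[0][0]  # L x L pixels, one per micro-lens
```

With these dimensions the sketch yields sixteen views of 8×8 pixels each, the first view holding the first pixel under each of the 64 micro-lenses. A real plenoptic pipeline would additionally require calibration of the micro-lens centers and de-mosaicing of the raw data.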
Reconstruction of sub-aperture images requires de-mosaicing. Techniques for recovering the matrix of views from raw plenoptic material are currently being developed, such as the one disclosed by N. Sabater et al. in “Light field demultiplexing and disparity estimation” International Conference on Computational Photography ICCP 2014.
In contrast to the plenoptic device, camera array devices, such as the Pelican Imaging® camera, directly deliver matrices of views (i.e. without de-mosaicing).
State-of-the-art methods for encoding such light field based images consist in using standard image or video codecs (such as JPEG, JPEG-2000, MPEG4 Part 10 AVC, HEVC). However, such standard codecs are not able to take into account the specificities of light field imaging (aka plenoptic data), which records the amount of light (the “radiance”) at every point in space, in every direction.
Indeed, applying the conventional standard image or video codecs (such as JPEG, JPEG-2000, MPEG4 Part 10 AVC, HEVC) delivers conventional imaging formats.
However, among the many new light field imaging functionalities provided by these richer sources of data is the ability to manipulate the content after it has been captured; these manipulations may have different purposes, notably artistic, task-based and forensic. For instance, it would be possible for users to change, in real time, focus, depth of field and stereo baseline, as well as the viewer perspective. Such media interactions and experiences are not available with conventional imaging formats that would be obtained by using the conventional standard image or video codecs to encode/decode light field based images.
It would hence be desirable to provide a technique for encoding/decoding light field based images that would not show these drawbacks of the prior art. Notably, it would be desirable to provide such a technique, which would allow a finer rendering of objects of interest of decoded images obtained from light field based images.