Light-field rendering has gained in importance in recent years with continuing advances in the fields of image processing and computer graphics. With light-field rendering, photo-realistic views of scenes can be created using previously digitised images, whether these images have been artificially created or are images of actual scenes captured by a camera. For instance, a light field can be generated from a three-dimensional model of a scene, or can be created using images of an actual scene, for example, images taken by an arrangement of cameras positioned about the scene. All the images of a scene, taken at a single instant, are collectively referred to as a ‘light-field data structure’. A description of light-field rendering is given by the paper “Light Field Rendering” (SIGGRAPH May 1996, Marc Levoy and Pat Hanrahan). Some well-known movies successfully combine traditional camera recording with computer-aided light-field rendering to generate complex but realistic special effects.
Light fields can be captured photographically using, for example, a hand-held camera, a remote-controlled camera, or an array of cameras mounted on a structure such as a gantry. FIG. 1 shows such an arrangement of cameras C, where the cameras are grouped to surround an area A of interest. For the sake of clarity, only a few cameras are shown. In reality, images must be generated from quite a large number of cameras, or viewpoints, for photo-realistic three-dimensional rendering of a scene. This leads to the problem of storage or transmission for the large number of images, since storage space and transmission bandwidth are expensive resources.
To store or transmit the images in a cost-effective way, the images can be compressed using some method of lossy data compression, which results in a loss of image quality that is, however, not noticeable to the viewer. An image compressed or coded using a lossy data compression algorithm can be represented using fewer bits. Several such lossy compression techniques are known. For example, the most common method used for lossy image compression is transform coding, in which a Fourier-related transform such as the Discrete Cosine Transform (DCT) is applied to the image data. One common standard for video (and audio) compression is the MPEG-2 (Moving Picture Experts Group) standard.
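The principle of transform coding can be illustrated with a minimal sketch in Python. It is not taken from any standard or from the cited papers: the function names, the sample values and the quantisation step size are assumptions chosen for illustration only, and a real codec operates on two-dimensional blocks and adds entropy coding. The sketch applies an orthonormal one-dimensional DCT to a row of pixel values, quantises the coefficients (the lossy step), and reconstructs an approximation of the row.

```python
import math


def _scale(k, n):
    # Normalisation factor that makes the DCT-II orthonormal.
    return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)


def dct(samples):
    """Orthonormal DCT-II of a one-dimensional sequence."""
    n = len(samples)
    return [
        _scale(k, n)
        * sum(samples[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        for k in range(n)
    ]


def idct(coeffs):
    """Inverse of dct() (an orthonormal DCT-III)."""
    n = len(coeffs)
    return [
        sum(
            _scale(k, n) * coeffs[k] * math.cos(math.pi * (i + 0.5) * k / n)
            for k in range(n)
        )
        for i in range(n)
    ]


def quantise(coeffs, step):
    """Lossy step: map each coefficient to an integer level."""
    return [round(c / step) for c in coeffs]


def dequantise(levels, step):
    """Recover approximate coefficients from the integer levels."""
    return [level * step for level in levels]


# Hypothetical row of eight pixel values and an assumed step size.
row = [52, 55, 61, 66, 70, 61, 64, 73]
STEP = 10.0

levels = quantise(dct(row), STEP)
approx = idct(dequantise(levels, STEP))
max_error = max(abs(a - b) for a, b in zip(row, approx))
```

A larger step size gives smaller integer levels, which can be stored in fewer bits, at the price of a larger reconstruction error.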
An image or frame that is compressed in its entirety, i.e. without using information obtained from other images, is often referred to as an ‘intra-coded image’, ‘intraimage’, ‘I-image’, or ‘I-frame’. Since the entire image or frame is compressed, it can be rendered again to a fairly high level of quality. However, even more bandwidth can be saved by making use of the fact that picture data in a sequence of images is often redundant. For example, a part of each frame, such as the sky, can remain the same over a sequence of frames. Evidently, this part of each image in the image sequence need only be coded once, and only those parts of an image that have changed with respect to a reference image need be coded. This type of compression is known as ‘interframe compression’ or ‘predictive coding’, and an image compressed in this way is referred to as an ‘interimage’, ‘P-image’ or ‘P-frame’. A P-image can be coded using a previous image (an I-image or a P-image) captured by the same camera. It has been shown that a good picture quality (from the viewer's point of view) can be obtained by using a compression scheme for a light-field data structure based on a trade-off between high quality (I-images) and low cost (P-images), in which some of the images are compressed as I-images and the remainder are compressed as P-images. In order to obtain a certain level of quality in rendering, however, the I-images should be evenly distributed over the light-field data structure, which can be understood to be a virtual arrangement of the images. FIG. 2a shows an example of such a compression scheme. Here, every second image in every second row of the light-field data structure is an I-image, as indicated by the letter “I”, and the remainder of the images are coded as P-images, as indicated by the letter “P”. FIG. 2b shows another possible compression scheme.
A technique for data compression using I-images and P-images is described in the paper “Data Compression for Light-Field Rendering” (Marcus Magnor and Bernd Girod, IEEE Transactions on Circuits and Systems for Video Technology, Vol. 10, No. 3, April 2000).
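A fixed scheme of the kind described for FIG. 2a, with every second image in every second row coded as an I-image, can be sketched in Python as follows. The function name, the grid dimensions and the choice of starting the pattern at the first row and first column are assumptions made for illustration; the figure itself may begin the pattern at a different position.

```python
def fixed_scheme(rows, cols):
    """Return a rows x cols grid of 'I' and 'P' labels.

    Assumption: the pattern starts at row 0, column 0, so an image is
    intra-coded when both its row index and column index are even.
    """
    return [
        ["I" if r % 2 == 0 and c % 2 == 0 else "P" for c in range(cols)]
        for r in range(rows)
    ]


# Hypothetical light-field data structure of 4 x 4 images.
scheme = fixed_scheme(4, 4)
for line in scheme:
    print(" ".join(line))

i_count = sum(label == "I" for line in scheme for label in line)
```

In this sketch a quarter of the images in the 4 x 4 grid are I-images, and the labelling depends only on the position of an image in the grid, not on what the image shows.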
In some image rendering applications, for example an interactive 3-D video application, it may be that some object or item is considered to be of particular importance, for example the football in a football match. Usually, the viewer's attention would be focused on the ball. In a 3-D interactive video application rendered using images captured as described above, the user would likely want to have the scenes rendered so that this “important object” is the centre of attention. However, state-of-the-art techniques of light-field data compression do not adapt to such considerations. Using the known techniques, a certain compression scheme is chosen, for example the scheme shown in FIG. 2a, and all the light-field data structures are coded using this scheme, regardless of which images would in fact be most suited for intraimage or interimage compression. Therefore, an ‘unfavourable’ compression scheme, in which the important object is not coded using a sufficient number of I-images, might lead to a noticeable deficiency in the quality of the rendered scenes.