The present invention pertains to efficient compression, decompression, rendering, and other operations for light fields. The present invention includes techniques for spatial displacement estimation between light field images, as well as techniques for multi-resolution light field operations.
A two-dimensional view of an object or static scene taken from a single perspective lacks certain information about the object or static scene as seen from other perspectives. A single view may lack information about occluded portions of the object or static scene. Even for visible portions of the object or static scene, the single view may lack information about characteristics, such as light intensity values, that change when the perspective of the view changes. To enlarge the set of information about an object or static scene, views can depict the object or static scene from multiple perspectives. The multiple perspective views can include information about parts of the object or static scene that are occluded in a single view. The multiple perspective views can include information about characteristics of the object or static scene that change when perspective changes. Without knowledge of the spatial relationships between multiple perspective views, however, it is difficult to relate one perspective view to another or interpolate between views to create a novel view.
A light field models the light characteristics of an object or static scene, for example, by capturing light intensity and color values along a surface around a static scene. To map a light field to a computational framework requires a discrete representation. FIGS. 1 and 2 depict a discretized light field 10. Light field 10 includes a set of spatially-related light field images of an object 20. FIG. 1 shows expanded views of light field images 12 and 14. A light field image comprises a two-dimensional arrangement (s,t) of data values such as values from a color space. Light rays from the object 20 that pass through a light field image (s,t) also pass through a focal point 32 in the (u,v) plane. A (s,t,u,v) grid point is indexed with (i,j,p,q). Capture and generation of light fields, different parameterizations of light fields, and light field image rendering, as well as other aspects of light fields, are described in Gortler et al., “The Lumigraph,” Computer Graphics Proceedings, Annual Conference Series, 1996, pp. 43-54 [“the Gortler reference”] and Levoy et al., “Light Field Rendering,” Computer Graphics Proceedings, Annual Conference Series, 1996, pp. 31-42 [“the Levoy reference”].
Storage and transmission of light fields present difficulties due to the amount of digital information in a typical light field. An illustrative light field consists of 16×16 focal points in the focal plane (u,v). If each light field image has a resolution of 256×256 and stores 24-bit RGB values, the total amount of storage is: 16×16×256×256×3 bytes = 48 Mbytes.
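By way of illustration only, the storage figure for the illustrative light field can be checked with the short sketch below. The variable names are hypothetical, and a Mbyte is assumed here to mean 1024×1024 bytes.

```python
# Storage for the illustrative light field: one image per (u,v) focal point.
focal_points = 16 * 16        # 16x16 grid of focal points in the (u,v) plane
pixels_per_image = 256 * 256  # 256x256 samples per light field image
bytes_per_pixel = 3           # 24-bit RGB

total_bytes = focal_points * pixels_per_image * bytes_per_pixel
total_mbytes = total_bytes / (1024 * 1024)  # assumes 1 Mbyte = 2**20 bytes
print(total_bytes, total_mbytes)
```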
Within a light field, light field images typically exhibit similarities if they are taken at adjacent locations. Therefore, there is often spatial redundancy in the light field images. Storage and transmission of the light field images is made more efficient by removing such redundancy.
In addition to the considerable storage and transmission requirements for a light field, manipulation of light field images presents considerable memory and processing requirements. Light field rendering is the process of creating a view of an object or static scene based upon a light field by interpolating from known light field image values. During light field rendering, parts of selected light field images are retrieved to construct a view from a novel perspective. Depending on the perspective of the novel view being rendered, different light field images are retrieved. Because rendering typically uses different parts of different light field images according to a complex pattern of access, random access to parts of light field images facilitates rendering. Unfortunately, loading multiple light field images into random access memory (to facilitate random access to dispersed light field samples) consumes large amounts of memory given the size of a typical light field image. Moreover, even after light field images are loaded into memory, light field operations are computationally complex, especially when decompression of the light field information is required. These high memory and processing requirements hinder real time rendering, especially for serialized rendering operations.
The complexity of rendering operations can be reduced at a cost to quality. Some quality loss may be acceptable. During periods of rapid motion between perspective views, the human eye does not perceive detail well. Other quality loss may be necessary to support real time rendering. Techniques for rendering at fixed, full resolution fail to gracefully degrade quality where efficient and acceptable, or where necessary, for real time rendering.
In view of the need for efficient storage, transmission, and manipulation of light field images, techniques are needed for compression of light fields in a way that supports rapid access to light field images at selective resolution.
To compress a light field, individual light field images can be independently compressed as still images using an intra-image coding technique. An intra-image coding technique typically uses a cascade of lossy and lossless compression techniques to reduce spatial redundancy within an image. For example, an image is transform coded into a frequency domain. The transform coefficients are then quantized and losslessly compressed. Intra-image coding techniques can yield fair compression ratios, but fail to fully exploit inter-image spatial redundancy. Moreover, intra-image coding typically yields variable bit rate output and does not facilitate efficient, rapid access to particular portions within compressed images.
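By way of illustration only, the cascade described above (transform coding, quantization, lossless compression) might be sketched as follows for a single row of samples. The sketch assumes a one-dimensional orthonormal DCT, a uniform quantizer with a hypothetical step size, and zlib as the lossless stage; none of these choices is taken from the references, and the function names are illustrative.

```python
import math
import struct
import zlib

def _scale(k, n):
    # Normalization that makes the DCT orthonormal.
    return math.sqrt((1.0 if k == 0 else 2.0) / n)

def dct_1d(samples):
    """Transform a row of samples into frequency coefficients (DCT-II)."""
    n = len(samples)
    return [_scale(k, n) * sum(x * math.cos(math.pi * (i + 0.5) * k / n)
                               for i, x in enumerate(samples))
            for k in range(n)]

def idct_1d(coeffs):
    """Inverse of dct_1d."""
    n = len(coeffs)
    return [sum(_scale(k, n) * c * math.cos(math.pi * (i + 0.5) * k / n)
                for k, c in enumerate(coeffs))
            for i in range(n)]

def encode_row(samples, step):
    # Lossy stage: quantize each coefficient to an integer level.
    levels = [round(c / step) for c in dct_1d(samples)]
    # Lossless stage: pack the levels as 16-bit integers and compress.
    return zlib.compress(struct.pack("<%dh" % len(levels), *levels))

def decode_row(blob, step):
    raw = zlib.decompress(blob)
    levels = struct.unpack("<%dh" % (len(raw) // 2), raw)
    return [round(v) for v in idct_1d([lv * step for lv in levels])]
```

A larger step discards more coefficient precision and shrinks the output at a cost to reconstruction quality. Because the compressed size depends on the content of the row, the output is variable in length, which is one reason this kind of coding does not by itself give rapid access to arbitrary portions of an image.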
The Levoy reference describes a two-stage process of vector quantization and entropy coding to reduce redundancy within a light field. During vector quantization, a light field is split into source vectors from two-dimensional light field images or four-dimensional light field portions. Source vectors are matched to a smaller number of reproduction vectors from a codebook. After vector quantization, the codebook and codebook indices are entropy coded by Lempel-Ziv coding. In decompression, the entire light field (codebook and codebook indices) is Lempel-Ziv decoded, a time-consuming operation. The output (codebook and codebook indices) is loaded into random access memory. Vector dequantization occurs for light field images or light field portions as needed. The Levoy reference does not involve compression of light field images to multiple levels of resolution or describe a way to rapidly access portions of light field images at selective levels of resolution. Moreover, the compression ratio of vector quantization without entropy coding is at most 24:1. Nonetheless, the Levoy reference uses vector quantization plus entropy coding rather than predictive coding, which is described as too complicated for rapid access to light field samples.
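By way of illustration only, the matching of source vectors to codebook entries and the later dequantization by index lookup might be sketched as follows. The sketch assumes a squared-error distance and a codebook that is already given; codebook training and the entropy coding stage are omitted, and the function names are hypothetical.

```python
def nearest_index(vector, codebook):
    """Index of the reproduction vector closest to `vector` (squared error)."""
    def sq_err(entry):
        return sum((a - b) ** 2 for a, b in zip(vector, entry))
    return min(range(len(codebook)), key=lambda i: sq_err(codebook[i]))

def vq_encode(vectors, codebook):
    # Each source vector is replaced by one codebook index.
    return [nearest_index(v, codebook) for v in vectors]

def vq_decode(indices, codebook):
    # Dequantization is a table lookup, so any sample can be decoded on demand.
    return [codebook[i] for i in indices]

# Tiny hand-picked codebook of RGB triples (an assumption for the example).
codebook = [(0, 0, 0), (255, 255, 255), (255, 0, 0)]
indices = vq_encode([(10, 5, 0), (250, 250, 240), (200, 30, 20)], codebook)
print(indices)
```

The lookup in `vq_decode` is what permits random access once the codebook and indices are in memory; the cost noted above lies in the entropy decoding that must come first.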
The Gortler reference describes applying JPEG compression to selected two-dimensional light field images of a light field. The techniques described in the Gortler reference do not facilitate efficient, rapid access at selective levels of resolution to particular portions of a light field image. The Gortler reference notes the potential for compression between light field images, but lacks detail concerning estimation, compression, decompression, and reconstruction techniques for light fields.
The present invention pertains to efficient compression, decompression, rendering, and other operations for light fields. The present invention includes techniques for spatial displacement estimation between light field images, as well as techniques for multi-resolution light field operations.
According to a first aspect of the present invention, a multi-resolution representation of a light field includes plural layers. One layer is a low granularity component layer and one or more other layers are higher granularity component layers. The higher granularity component layers represent less significant information about the light field than the low granularity component layer. A selective granularity operation uses the multi-resolution representation of the light field.
One type of selective granularity operation involves separation of a light field image into plural frequency component layers. For this type of operation, granularity of the multi-resolution representation corresponds to spatial frequency. For example, bandpass filters separate a light field image into n frequency component layers. The frequency component layers are then sub-sampled by a factor of n. The bandpass filters can frequency decompose the light field image in multiple directions and to a desired level of precision. The results of frequency decomposition can then be compressed.
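By way of illustration only, the case n = 2 might be sketched as follows for one row of a light field image. The sketch assumes the simplest two-band filter pair (pairwise average and pairwise difference, as in a Haar decomposition) in place of general bandpass filters, and an even row length; the function name is hypothetical.

```python
def analyze_two_band(row):
    """Split an even-length row into a low and a high frequency layer.

    Each layer is sub-sampled by a factor of 2, so the two layers together
    hold as many values as the original row.
    """
    low = [(row[i] + row[i + 1]) / 2 for i in range(0, len(row), 2)]
    high = [(row[i] - row[i + 1]) / 2 for i in range(0, len(row), 2)]
    return low, high

low, high = analyze_two_band([10, 12, 20, 20])
print(low, high)
```

Applying the same split again to the low layer, or along columns as well as rows, gives decomposition in multiple directions and to further levels of precision.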
Another type of selective granularity operation involves spatial displacement estimation of a prediction light field image from one or more reference light field images. For this type of operation, granularity of the multi-resolution representation corresponds to the degree of spatial displacement estimation refinement. At a top level, rough spatial displacement estimates are made for groups of pieces of a prediction light field image. This rough spatial displacement estimation can be refined by residual values, by spatial displacement estimation for pieces within a group of pieces, or by selection by pieces within a group of pieces of suitable rough displacement estimates.
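By way of illustration only, a two-level estimate might be sketched as follows in one dimension: a rough displacement is found for each group of pieces, and each piece then searches a small range around that rough value. The sketch assumes a sum-of-absolute-differences match criterion and hypothetical parameter names; the other refinement options mentioned above (residual values, selection among rough estimates) are not shown.

```python
def sad(pred, ref, start, offset, length):
    """Sum of absolute differences for one candidate displacement."""
    return sum(abs(pred[start + i] - ref[start + offset + i])
               for i in range(length))

def best_offset(pred, ref, start, length, candidates):
    # Keep only displacements that stay inside the reference.
    valid = [d for d in candidates
             if start + d >= 0 and start + d + length <= len(ref)]
    return min(valid, key=lambda d: sad(pred, ref, start, d, length))

def estimate(pred, ref, group_len, piece_len, search=4, refine=1):
    """Return (piece_start, rough, refined) for every piece of `pred`."""
    result = []
    for g in range(0, len(pred), group_len):
        # Top level: one rough displacement for the whole group of pieces.
        rough = best_offset(pred, ref, g, group_len,
                            range(-search, search + 1))
        for p in range(g, g + group_len, piece_len):
            # Lower level: each piece refines around the rough estimate.
            fine = best_offset(pred, ref, p, piece_len,
                               range(rough - refine, rough + refine + 1))
            result.append((p, rough, fine))
    return result
```

A decoder that stops at the rough estimates obtains a coarser prediction; one that also applies the per-piece refinements obtains a finer one.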
A third type of selective granularity operation, similar to the first type of operation, involves decompression and synthesis of frequency component layers into a light field image. Decompression of higher frequency component layer information can be conditionally bypassed if no such information exists or to compensate for a processor, memory, transmission, or other system constraint.
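By way of illustration only, synthesis with conditional bypass might be sketched as follows for a two-layer representation. The sketch assumes that the low layer holds pairwise averages and the high layer holds pairwise half-differences of the original row; that pairing and the function name are assumptions for the example.

```python
def synthesize_two_band(low, high=None):
    """Rebuild a row from a low layer and an optional high layer.

    Passing high=None bypasses the higher frequency layer, for example
    when it is absent or when time or memory is short; the result is then
    a smoother approximation of the original row.
    """
    if high is None:
        high = [0] * len(low)
    row = []
    for l, h in zip(low, high):
        row.extend([l + h, l - h])
    return row

full = synthesize_two_band([11, 20], [-1, 0])
coarse = synthesize_two_band([11, 20])
print(full, coarse)
```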
A fourth type of selective granularity operation, similar to the second and third types of operation, involves decompression and combination of spatial displacement estimation information. Decompression of this information can be conditionally bypassed if no such information exists or to compensate for a processor, memory, transmission, or other system constraint.
These various types of operations can be combined according to the present invention.
According to a second aspect of the present invention, a data structure stores a multi-resolution representation of a light field. The data structure includes a base field that stores low granularity information and an enhancement field that stores higher granularity information. In one use, information in the base field is decompressed and then information in the enhancement field is selectively accessed and decompressed. An array of flag values supports conditional bypassing of the selective access and decompression of enhancement information.
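By way of illustration only, such a data structure and its use might be sketched as follows. The field names, the use of one flag per section of enhancement information, and zlib as the compression method are all assumptions made for the example.

```python
import zlib
from dataclasses import dataclass

@dataclass
class MultiResRecord:
    base: bytes            # compressed low granularity information
    enhancement: list      # compressed higher granularity chunks, one per section
    has_enhancement: list  # flag array: True where a chunk is worth decoding

def decode(record, want_detail=True):
    """Decompress the base field, then selectively decompress enhancement."""
    base = zlib.decompress(record.base)
    details = []
    for flag, chunk in zip(record.has_enhancement, record.enhancement):
        if want_detail and flag:
            details.append(zlib.decompress(chunk))
        else:
            details.append(None)  # access and decompression bypassed
    return base, details

rec = MultiResRecord(
    base=zlib.compress(b"coarse"),
    enhancement=[zlib.compress(b"fine0"), b""],
    has_enhancement=[True, False],
)
print(decode(rec))
print(decode(rec, want_detail=False))
```

Checking a flag costs far less than attempting to decompress an empty or unneeded chunk, which is the purpose of keeping the flags in a separate array.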
According to a third aspect of the present invention, a transmitter transmits low granularity information for prediction and reference light field images. The transmitter then selectively transmits enhancement information for the prediction and reference light field images.
According to a fourth aspect of the present invention, a section of a prediction light field image is compressed by estimating spatial displacement from one or more reference light field images. Constraining placement and size of a search window in a reference light field image, based upon a geometrical relationship between the prediction and reference light field images, improves performance. Various other techniques also improve performance of spatial displacement estimation. These techniques include edge extension of a reference light field image, differential coding of displacement vectors, and multi-predictor spatial displacement estimation. In addition to compression by spatial displacement estimation, this aspect of the present invention also includes decompression.
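By way of illustration only, differential coding of displacement vectors might be sketched as follows. The sketch assumes that each vector is predicted from the vector immediately before it in scan order, with (0, 0) as the initial predictor; the predictor choice and the function names are assumptions for the example.

```python
def diff_encode(vectors):
    """Replace each (dx, dy) vector by its difference from the previous one."""
    prev_dx, prev_dy = 0, 0
    deltas = []
    for dx, dy in vectors:
        deltas.append((dx - prev_dx, dy - prev_dy))
        prev_dx, prev_dy = dx, dy
    return deltas

def diff_decode(deltas):
    """Recover the displacement vectors by accumulating the differences."""
    dx, dy = 0, 0
    vectors = []
    for ddx, ddy in deltas:
        dx, dy = dx + ddx, dy + ddy
        vectors.append((dx, dy))
    return vectors

vectors = [(3, 0), (3, 1), (4, 1), (4, 1)]
deltas = diff_encode(vectors)
print(deltas)
```

Neighboring sections of a light field image tend to have similar displacements, so the differences cluster near zero and compress well in a later lossless stage.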
According to a fifth aspect of the present invention, a configuration of reference and prediction light field images reflects the geometrical relationships between light field images. For example, a 5×5 configuration of light field images includes reference light field images at the corners and prediction light field images at other locations.
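By way of illustration only, the example configuration might be laid out as follows, marking reference light field images with R and prediction light field images with P. The function name and the letter codes are hypothetical.

```python
def corner_reference_layout(rows=5, cols=5):
    """Mark corner images as references (R) and all others as predictions (P)."""
    layout = []
    for r in range(rows):
        line = ""
        for c in range(cols):
            is_corner = r in (0, rows - 1) and c in (0, cols - 1)
            line += "R" if is_corner else "P"
        layout.append(line)
    return layout

layout = corner_reference_layout()
for line in layout:
    print(line)
```

In this layout 4 of the 25 images are references and the remaining 21 are predicted from them.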
In addition to applying to light fields, the present invention applies to other types of spatially-related views of an object or static scene.
Additional features and advantages of the invention will be made apparent from the following detailed description of an illustrative embodiment that proceeds with reference to the accompanying drawings.