Description of the Related Art
In a conventional camera, the main lens maps the 3D world of the scene outside the camera into a 3D world inside the camera. FIG. 1 illustrates imaging in a conventional camera. “Inside world” represents the interior of the camera. The shaded oval regions represent the order of depths in the outside world and the corresponding depths inside the camera. One particular image plane inside the camera is shown. The mapping of the 3D world of the scene outside the camera into the 3D world inside the camera is governed by the lens equation:
1/A + 1/B = 1/F
where A and B are respectively the distances from the lens to the object plane and from the lens to the image plane, and F is the focal length of the lens. This equation is normally used to describe the effect of a single image mapping between two fixed planes. In reality, however, the lens equation describes an infinite number of mappings: it constrains the relationship between, but does not fix, the values of the distances A and B. That is, every plane in the outside scene (described as being at some distance A from the objective lens) is mapped by the objective lens to a corresponding plane inside the camera at a distance B. When a sensor (e.g., conventional film, a charge-coupled device (CCD), etc.) is placed at a distance B between F and ∞ (infinity) inside the camera, the sensor captures an in-focus image of the corresponding plane at A that was mapped from the scene in front of the lens.
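As a sketch of the mapping described above, the thin-lens equation can be solved numerically for the image distance B given an object distance A and focal length F. The function name and sample values below are illustrative only and do not appear in the text.

```python
def image_distance(A: float, F: float) -> float:
    """Solve the thin-lens equation 1/A + 1/B = 1/F for B,
    the distance from the lens to the in-focus image plane."""
    if A <= F:
        raise ValueError("object at or inside the focal length forms no real image")
    return 1.0 / (1.0 / F - 1.0 / A)

# Example: a 50 mm lens imaging an object 2 m away (distances in mm).
B = image_distance(A=2000.0, F=50.0)   # B is slightly greater than F
```

Note that as A grows toward infinity, B approaches F, consistent with the statement that B lies between F and ∞ for real objects in front of the lens.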
Conventional cameras render a three-dimensional scene onto a two-dimensional sensor. During operation, a conventional digital camera captures a two-dimensional (2-D) image representing a total amount of light that strikes each point on a photosensor within the camera. However, this 2-D image contains no information about the direction of the light that strikes the photosensor. The image captured by a conventional camera essentially integrates the radiance function over its angular portion, resulting in a two-dimensional intensity as a function of position. The angular information of the original radiance is lost. Thus, conventional cameras fail to capture a large amount of optical information.
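The integration over the angular portion of the radiance described above can be sketched as follows; the 4-D sampling grid and its dimensions are hypothetical, chosen only to illustrate how the angular information is lost.

```python
import numpy as np

# Hypothetical 4-D radiance sampled as radiance[x, y, u, v]:
# (x, y) are positional coordinates, (u, v) are angular coordinates.
rng = np.random.default_rng(0)
radiance = rng.random((32, 32, 8, 8))

# A conventional sensor integrates over the angular portion of the
# radiance, leaving only a 2-D intensity as a function of position.
image = radiance.sum(axis=(2, 3))
```

The resulting `image` is 32×32: the two angular dimensions have been collapsed, and no directional information can be recovered from it.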
Light-Field or Radiance Capturing Cameras
In contrast to conventional cameras, light-field, or radiance capturing, cameras sample the four-dimensional (4-D) optical phase space or light-field, and in doing so capture information about the directional distribution of the light rays. This information captured by light-field cameras may be referred to as the light-field, the plenoptic function, or radiance. In computational photography, a light-field is a 4-D record of all light rays in 3-D. Radiance describes both spatial and angular information, and is defined as density of energy per unit of area per unit of solid angle (in steradians). A light-field camera captures radiance; therefore, light-field images originally taken out-of-focus may be refocused, noise may be reduced, viewpoints may be changed, and other light-field effects may be achieved.
Light-fields, i.e. radiance, may be captured with a conventional camera. In one conventional method, M×N images of a scene may be captured from different positions with a conventional camera. If, for example, 8×8 images are captured from 64 different positions, 64 images are produced. The pixels at each position (i, j), one from each image, are grouped into a block, generating blocks of 64 angular samples each. FIG. 2 illustrates an example prior art light-field camera, or camera array, which employs an array of two or more objective lenses 110. Each objective lens focuses on a particular region of photosensor 108, or alternatively on a separate photosensor 108. This light-field camera 100 may be viewed as a combination of two or more conventional cameras that each simultaneously records an image of a subject on a particular region of photosensor 108 or alternatively on a particular photosensor 108. The captured images may then be combined to form one image.
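The regrouping of camera-array images into per-pixel blocks described above amounts to a transpose of the 4-D sample array; the image stack and its dimensions below are hypothetical placeholders for the M×N captured views.

```python
import numpy as np

# Hypothetical stack of M×N images from a camera array:
# images[m, n] is the 2-D image captured at grid position (m, n).
M, N, H, W = 8, 8, 16, 16
rng = np.random.default_rng(1)
images = rng.random((M, N, H, W))

# Regroup so that blocks[i, j] collects the pixel at position (i, j)
# from every camera: an M×N block of angular samples per pixel.
blocks = images.transpose(2, 3, 0, 1)   # shape (H, W, M, N)
```

Each of the H×W blocks then holds the 64 angular samples for one scene point, which together constitute a sampled 4-D light-field.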
FIG. 3 illustrates an example prior art plenoptic camera, another type of radiance capturing camera, that employs a single objective lens and a microlens or lenslet array 106 that includes, for example, about 100,000 lenslets. In a conventional plenoptic camera 102, lenslet array 106 is fixed at a small distance (~0.5 mm) from a photosensor 108, e.g. a charge-coupled device (CCD). In conventional plenoptic cameras, the microlenses are placed and adjusted accurately to be exactly at one focal length f from the sensor 108, where f is the focal length of the microlenses. In other words, the microlenses are focused on infinity, which is essentially equivalent to focusing them on the main lens 104, given that the distance from the microlenses to the main lens is large relative to the focal length of the microlenses. Thus, the raw image captured with plenoptic camera 102 is made up of an array of small images, typically circular, of the main lens 104. These small images may be referred to as microimages. However, in conventional plenoptic camera 102, each microlens does not create an image of the internal world on the sensor 108, but instead creates an image of the main camera lens 104.
The lenslet array 106 enables the plenoptic camera 102 to capture the light-field, i.e. to record not only image intensity, but also the distribution of intensity in different directions at each point. Each lenslet splits a beam coming to it from the main lens 104 into rays coming from different locations on the aperture of the main lens 104. Each of these rays is recorded as a pixel on photosensor 108, and the pixels under each lenslet collectively form an n-pixel image. This n-pixel area under each lenslet may be referred to as a macropixel, and the camera 102 generates a microimage at each macropixel. The plenoptic photograph captured by a camera 102 with, for example, 100,000 lenslets will contain 100,000 macropixels, and thus generate 100,000 microimages of a subject. Each macropixel contains different angular samples of the light rays coming to a given microlens. Each macropixel contributes to only one pixel in the different angular views of the scene; that is, only one pixel from a macropixel is used in a given angular view. As a result, each angular view contains 100,000 pixels, each pixel contributed from a different macropixel. Another type of integral or light-field camera is similar to the plenoptic camera of FIG. 3, except that an array of pinholes is used between the main lens and the photosensor instead of an array of lenslets.
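The extraction of an angular view described above, taking exactly one pixel from each macropixel, can be sketched with strided slicing; the raw-image dimensions and macropixel size are illustrative assumptions, not values from the text.

```python
import numpy as np

# Hypothetical raw plenoptic image: a grid of macropixels, each
# macropixel being a p×p microimage (here p = 8).
p, rows, cols = 8, 100, 100   # 100 * 100 = 10,000 macropixels
rng = np.random.default_rng(2)
raw = rng.random((rows * p, cols * p))

def angular_view(raw, p, u, v):
    """Take pixel (u, v) from every p×p macropixel to form one angular
    view; each macropixel contributes exactly one pixel to the view."""
    return raw[u::p, v::p]

view = angular_view(raw, p, 3, 4)
```

The resulting `view` has one pixel per macropixel (100×100 here), matching the observation that each angular view contains as many pixels as there are macropixels.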
FIG. 4 further illustrates an example prior art plenoptic camera model. In conventional plenoptic camera 102, the microlens-space system swaps positional and angular coordinates of the radiance at the microlens. For clarity, only the rays through one of the microlenses are illustrated. The conventional optical analysis of such a plenoptic camera considers it as a cascade of a main lens system followed by a microlens system. The basic operation of the cascade system is as follows. Rays focused by the main lens 104 are separated by the microlenses 106 and captured on the sensor 108. At their point of intersection, the rays have the same position but different slopes. This difference in slopes causes the separation of the rays when they pass through a microlens-space system. In more detail, each microlens functions to swap the positional and angular coordinates of the radiance, and this new positional information is captured by the sensor 108. Because of the swap, the positional information recorded on the sensor actually represents the angular information of the radiance at the microlens. As a result, each microlens image captured by sensor 108 represents the angular information for the radiance at the position of the optical axis of the corresponding microlens.
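The position-angle swap performed by the microlens-space system can be verified with standard ray-transfer (ABCD) matrices: a thin lens of focal length f followed by free-space propagation over the same distance f. The focal length value below is an illustrative assumption.

```python
import numpy as np

f = 0.5  # hypothetical microlens focal length, in mm

lens = np.array([[1.0, 0.0], [-1.0 / f, 1.0]])   # thin-lens matrix
space = np.array([[1.0, f], [0.0, 1.0]])         # propagation over f

system = space @ lens
# system == [[0, f], [-1/f, 1]]: the zero in the upper-left entry means
# the position at the sensor depends only on the angle at the microlens,
# i.e. positional and angular coordinates are swapped (up to scaling).
x, theta = 0.1, 0.2                    # ray position and slope at the microlens
x_sensor, _ = system @ np.array([x, theta])
```

Here `x_sensor` equals f·theta regardless of x, which is exactly the swap the cascade analysis describes: the sensor records angular information as position.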
The light-field is the radiance density function describing the flow of energy along all rays in three-dimensional (3D) space. Since the description of a ray's position and orientation requires four parameters (e.g., two-dimensional positional information and two-dimensional angular information), the radiance is a four-dimensional (4D) function. This function may be referred to as the plenoptic function. Image sensor technology, on the other hand, is only two-dimensional, and light-field imagery must therefore be captured and represented in flat (two-dimensional) form. A variety of techniques have been developed to transform and capture the 4D radiance in a manner compatible with 2D sensor technology. This may be referred to as a flat representation of the 4D radiance (or light-field), or simply as a flat.