The rendering of images in computer graphics involves two processes: (1) the simulation of a radiant three-dimensional scene, followed by (2) the rendering of the scene into a two-dimensional image which models a given image formation process. Although techniques for the simulation of the 3D scene have been developing rapidly, the 2D imaging process remains based primarily on the standard camera model (or pin-hole model) and the thin-lens-with-finite-aperture model.
These camera models can produce an image containing some photographic-like effects; however, in order to keep computational complexity at a minimum they are highly idealized, and as a result are not suitable for simulating the behavior of a particular physical camera and lens system. The pin-hole camera model is the most idealized. It results in an image which is focused everywhere on an image plane, regardless of each object's distance from the camera. Depth of field is just one of several physical camera properties that are absent from this model (depth of field relates to the property that some objects are imaged in focus while others at different distances acquire blur). Further, since the pin-hole aperture is infinitesimal rather than finite, many rays which would be sampled in a physical camera with finite aperture are not sampled by the standard camera model. Thus camera models which supplement the pin-hole model with some post-processing of the image in order to add realistic effects cannot properly model a physical camera and lens system. For example, Potmesil and Chakravarty, Computer Graphics (SIGGRAPH '81 Proceedings), volume 15, pages 297-305, August 1981, use post-processing to simulate depth of field. After sampling the scene with a pin-hole camera model, the authors apply a blurring technique to the resulting image.
The thin-lens-and-finite-aperture model introduced by Cook et al., Computer Graphics (SIGGRAPH '84 Proceedings), volume 18, pages 137-145, July 1984, has become the standard in the computer graphics community. Thin lens camera models can exhibit more photographic effects than the pin-hole model and in particular account for some depth of field aspects quite naturally. However, this model remains highly idealized, and additional features typical of physical cameras and lens systems are not adequately simulated by it. For example, since the film plane is assumed to be in a fixed position parallel to the lens, the thin lens model cannot capture changes in field of view, depth of field, and exposure due to the relative movement of the image surface and lens system, as occurs during focusing, nor can it model situations in which the film plane is not parallel to the lens system, as in a view camera. In addition, the use of a thin lens approximation precludes several effects. In particular, this model cannot correctly simulate the geometry of image formation in order to produce an appropriate perspective projection for a specified thick lens system, nor can it exhibit the large variety of non-ideal lens behaviors, including geometric aberrations (for example, the barrel distortion produced by a fisheye lens).
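As a rough illustration of how a thin lens accounts for depth of field, the following sketch applies the Gaussian thin-lens equation, 1/f = 1/s_o + 1/s_i, together with similar triangles to estimate the diameter of the blur circle that an out-of-focus point produces on the film plane. The function names, parameter names, and choice of millimeter units are assumptions made for illustration; they are not taken from Cook et al. or from any particular renderer.

```python
def image_distance(focal_length, object_distance):
    """Distance behind an ideal thin lens at which an object comes into focus.

    Uses the Gaussian thin-lens equation 1/f = 1/s_o + 1/s_i.
    All lengths share one unit (assumed here to be millimeters).
    """
    if object_distance <= focal_length:
        raise ValueError("object must lie beyond the focal length to form a real image")
    return focal_length * object_distance / (object_distance - focal_length)


def blur_circle_diameter(focal_length, aperture_diameter, focus_distance, object_distance):
    """Diameter of the blur circle on the film plane for a point at object_distance.

    The film plane is placed where objects at focus_distance are sharp. Light
    from a point at another distance converges in front of or behind the film,
    so the cone of rays from the aperture meets the film in a disc.
    """
    film = image_distance(focal_length, focus_distance)
    converge = image_distance(focal_length, object_distance)
    # Similar triangles: the cone has width aperture_diameter at the lens
    # and width zero at the convergence point.
    return aperture_diameter * abs(converge - film) / converge


if __name__ == "__main__":
    # Hypothetical 50 mm lens with a 25 mm aperture, focused at 2 m.
    for distance in (1000.0, 2000.0, 4000.0):
        c = blur_circle_diameter(50.0, 25.0, 2000.0, distance)
        print(f"object at {distance:6.0f} mm -> blur circle {c:.3f} mm")
```

Only the point at the focus distance yields a blur circle of zero diameter. This is the depth of field behavior that the thin lens model captures and the pin-hole model does not. The sketch inherits the idealizations discussed above: a single refracting plane, a film plane fixed parallel to the lens, and no aberrations.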
In the prior art, approaches to non-ideal lens image formation such as Max, Nicograph '83 Proceedings, pages 137-159, December 1983, have relied on nonlinear mappings being applied to an image generated from a pin-hole model. The mappings are extrapolated from data fitting routines and other ad-hoc or empirical methods. Max used distortion data from the manufacturer of the Omnimax lens to derive a polynomial. The polynomial was used to warp ray directions in implementing a standard thin lens model (it is noted that ray tracing through the Omnimax lens was used by the manufacturer to generate distortion data). Such approaches are limited to a specific lens, and the precise correspondence between the simulated image and the physical image is questionable. In addition, they neither include depth-of-field effects nor reproduce aberrations other than distortion.
In addition to proper geometrical image formation, radiometry is an important aspect of physical-camera image creation which is not properly considered in the prior art. Previous techniques do not compute exposure correctly and, in particular, neglect to account for the levels and variation of irradiance across the film plane. Vignetting, the blockage of wide-angle rays by the lens system's stops, is also not accounted for in prior art camera models. The work by Cook et al., Computer Graphics (SIGGRAPH '84 Proceedings), volume 18, pages 137-145, July 1984, lacks both of these physical camera features.
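To give a sense of how irradiance can vary across the film plane, the following sketch evaluates the familiar cos^4 falloff, a common textbook approximation for a small aperture facing a flat film plane under uniform scene radiance. It is offered only as an assumed illustration of the effect. It ignores vignetting and the finite extent of the exit pupil, and the function and parameter names are hypothetical.

```python
import math


def relative_irradiance(x, y, film_distance):
    """Irradiance at film point (x, y) relative to the on-axis value.

    Assumes a small aperture on the optical axis at distance film_distance
    from a flat film plane, and uniform scene radiance. Under these
    assumptions irradiance falls off as cos(theta)**4, where theta is the
    angle between the optical axis and the line from the aperture to (x, y).
    """
    if film_distance <= 0:
        raise ValueError("film_distance must be positive")
    cos_theta = film_distance / math.sqrt(x * x + y * y + film_distance * film_distance)
    return cos_theta ** 4


if __name__ == "__main__":
    # Hypothetical 36 mm wide film held 50 mm behind the aperture.
    for x in (0.0, 9.0, 18.0):
        print(f"x = {x:4.1f} mm -> relative irradiance {relative_irradiance(x, 0.0, 50.0):.3f}")
```

A camera model that assigns every pixel the same weight in effect treats this factor as constant across the film, which is one way in which exposure can be computed incorrectly.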
Increasingly, there is a need for rendering images which are more realistic and which closely resemble images created by use of a specified lens and camera system. For example, in many applications (video special effects, augmented reality, etc.) it is desirable to seamlessly merge acquired imagery with synthetic imagery. As another example, in some machine vision and scientific applications it is necessary to simulate cameras and sensors accurately; a vision system may need to test whether its internal model of the world matches what is being observed. In both of these situations it is important that the synthetic imagery be computed using a camera model that closely approximates the real camera and lens system.
In spite of the utility of a physically accurate camera model, no such model is found in the prior art. In particular, one does not find a model which combines the physical principles of ray tracing, to simulate non-ideal lens image formation, with radiometric principles, to accurately compute film exposure. Presumably, physically accurate camera models have not been implemented in prior art systems because no implementation was known that would avoid introducing unacceptable increases in computational complexity and expense.