Some conventional image acquisition systems have the capacity to combine individual images for the purpose of producing composite images that detail an enlarged field of view. These image acquisition systems use methodologies that rely upon the capture of the images by one or more cameras. In order to combine the images that are captured, some conventional systems rely on the overlap of image regions of the captured source images.
The quality of a composite image is constrained by the imagery that is used in its creation. It should be appreciated that the resolution involved and the number of viewpoints that are considered are important factors that impact the creation of composite images. The greater the resolution and the number of viewpoints provided, the greater the spatial resolution of the resultant composite image. While digital still cameras are reaching mega-pixel dimensions at nominal cost (e.g., providing increasingly higher resolution images), the spatial resolution provided by digital video systems lags far behind that offered by digital still cameras.
Although multi-viewpoint camera systems have been in existence since the dawn of photography, most conventional image analysis is based upon single camera views. It should be appreciated that, although stereo and moving video cameras can provide more viewpoints, simultaneous acquisition from a large number of perspectives remains rare in such imaging systems. A principal reason for the lower resolution and limited number of viewpoints that are conventionally employed in personal computer (PC) imaging systems is the high bandwidth necessary to support sustained data movement from numerous video sources. The data is provided to a computer memory and, eventually, to a display, at the conventional supply rate of 30 frames per second. Moreover, access to high-bandwidth multiple-stream video has been limited.
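The bandwidth demand described above can be illustrated with a short calculation. The following is a minimal sketch only: the frame size (640 by 480 pixels), the 3 bytes per pixel, and the camera counts are assumptions chosen for illustration and do not come from any particular system. The 133 MB/s figure is the commonly quoted theoretical peak of a 32-bit, 33 MHz PCI bus and is included only as a point of comparison.

```python
# Illustrative estimate of the sustained data rate of uncompressed video.
# All numeric inputs below are assumptions, not values from a real system.

PCI_PEAK_MB_PER_SEC = 133.0  # commonly quoted 32-bit/33 MHz PCI peak (for comparison only)


def stream_bandwidth_bytes_per_sec(width, height, bytes_per_pixel, fps=30):
    """Bytes per second needed to move one uncompressed video stream."""
    return width * height * bytes_per_pixel * fps


def total_bandwidth_mb_per_sec(num_cameras, width, height, bytes_per_pixel, fps=30):
    """Aggregate rate, in megabytes (10**6 bytes) per second, for several streams."""
    per_stream = stream_bandwidth_bytes_per_sec(width, height, bytes_per_pixel, fps)
    return num_cameras * per_stream / 1e6


if __name__ == "__main__":
    # Hypothetical camera counts, each camera delivering 640x480 RGB at 30 frames per second.
    for cameras in (1, 6, 18):
        rate = total_bandwidth_mb_per_sec(cameras, 640, 480, 3)
        exceeds = rate > PCI_PEAK_MB_PER_SEC
        print(f"{cameras:2d} camera(s): {rate:8.3f} MB/s, exceeds one PCI bus: {exceeds}")
```

Under these assumptions a single stream needs about 27.6 MB/s, so the aggregate rate grows past the peak of a single such bus after only a handful of cameras.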
Bandwidth issues arise at the display end of conventional imaging systems as well. This is because moving large amounts of digital video severely taxes current PC architectures. Real-time display of these data requires a judicious mix across peripheral component interconnect (PCI), PCI-X, and accelerated graphics port (AGP) buses distributed over multiple display cards.
The creation of composite images (e.g., mosaicking) involves combining source images captured from a plurality of camera viewpoints. The source images are derived from viewpoint-associated video streams and are used to form the composite image. A conventional approach to the creation of composite images involves finding points that correspond in the contributing images and computing stitching homographies that relate their perspectives. This approach derives from the situation where images are collected from arbitrary positions, such as in handheld capture. There, the features for deriving each homography must come from the acquired images themselves. If the camera views share a center of projection, the features can be chosen from anywhere in the overlapping images and their homographies will be valid throughout the scene viewed. However, when the camera views do not share a center of projection, the features must be collected from a shared observation plane and the homography may only produce seamless composite images for imagery in that plane.
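The computation of a homography from corresponding points can be sketched as follows. This is a minimal illustration, assuming exactly four correspondences, no three of which are collinear, and fixing the bottom-right entry of the 3x3 matrix to 1; it is not presented as the method used by any particular conventional system. The function names (`solve_linear_system`, `homography_from_points`, `apply_homography`) are hypothetical, and practical stitching pipelines typically use many more correspondences together with a robust estimator.

```python
# Minimal sketch: estimate a 3x3 homography from four point correspondences.
# Each pair (x, y) -> (u, v) contributes two linear equations in the eight
# unknown entries h11..h32, with h33 fixed to 1 (an assumption of this sketch).


def solve_linear_system(a, b):
    """Solve a * x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate point configuration")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def homography_from_points(src, dst):
    """Return H (list of three rows) mapping each src point onto its dst point."""
    if len(src) != 4 or len(dst) != 4:
        raise ValueError("exactly four correspondences are required")
    a, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        # u = (h11*x + h12*y + h13) / (h31*x + h32*y + 1), rearranged to be linear
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        # v = (h21*x + h22*y + h23) / (h31*x + h32*y + 1), rearranged to be linear
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = solve_linear_system(a, b)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]


def apply_homography(h, point):
    """Map a 2D point through H, including the perspective divide."""
    x, y = point
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    u = (h[0][0] * x + h[0][1] * y + h[0][2]) / w
    v = (h[1][0] * x + h[1][1] * y + h[1][2]) / w
    return (u, v)


if __name__ == "__main__":
    # Hypothetical correspondences: the unit square seen in a second view.
    src = [(0, 0), (1, 0), (1, 1), (0, 1)]
    dst = [(10, 20), (12, 20), (12, 22), (10, 22)]
    h = homography_from_points(src, dst)
    print(apply_homography(h, (0.5, 0.5)))
```

When the cameras do not share a center of projection, a matrix estimated this way is only meaningful for scene points lying in the plane from which the correspondences were taken, which is the limitation noted above.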
For the reasons outlined above, conventional systems that composite images are relegated to low-resolution implementations that employ a limited number of viewpoints. The limited number of viewpoints provides a limited capacity to produce panoramas from acquired images that have high spatial resolution. The performance of conventional systems is further limited by their reliance on the use of overlapping image data to generate homographies. The requirement that the source images used to compose a composite image overlap decreases the size of the view angle that can be imaged, as it prevents the imaging of non-overlapping views that can cover a wider measure of space.