A compound camera consists of a set of component cameras, a data processor and image processing software that runs on the data processor. The component cameras may be synchronized through wired or wireless electronic signals. Individual images from the component cameras are transmitted to the data processor through wired or wireless connections. The image processing software takes images from the component cameras as input and synthesizes an output image following the specifications of a virtual camera.
A conventional compound camera may be implemented in a number of ways. In a first conventional embodiment, a compound camera may comprise a number of synchronized regular video cameras and a separate microprocessor connected to the video component cameras. In a second conventional embodiment, a plurality of component image sensors and a microprocessor may be integrated on one substrate, such as a printed circuit board (PCB) or a hybrid substrate. Synchronization and communication are accomplished through the printed circuit connections on the substrate. In a third conventional embodiment, the component image sensors and the microprocessor are very small and are integrated on a single silicon chip.
The physical model of a camera consists of a shutter, a lens and an image plane. The shutter has an opening called an aperture that lets light enter into the camera. A bundle of light rays coming out of a point on an object surface enters through the aperture, is refracted by the lens, and is gathered and focused on the image plane, where the color of the object point is recorded.
For a given aperture size, there is a range of depth within which the image is sharp. This is called the “depth-of-field” and it is inversely proportional to the aperture size. The image plane slides back and forth to search for the best overall image within the range of the depth-of-field. Normally, large depth-of-field coverage is desired. This, in turn, requires high sensitivity from the sensor, because a large depth-of-field calls for a correspondingly small aperture, which admits less light.
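The trade-off between aperture size and depth-of-field can be illustrated with the standard thin-lens approximations found in photography references. The Python sketch below is not taken from any particular camera implementation. The function name `depth_of_field` and the circle-of-confusion parameter `coc_mm` (the largest blur spot still regarded as sharp) are assumptions introduced here for illustration.

```python
def depth_of_field(focal_mm, f_number, focus_mm, coc_mm=0.03):
    """Return (near, far, total) limits of acceptable sharpness in mm.

    Textbook thin-lens approximation. f_number is the focal length divided
    by the aperture diameter, so a larger f_number means a smaller aperture.
    coc_mm is an assumed circle-of-confusion diameter.
    """
    # Hyperfocal distance: focusing here keeps everything out to infinity sharp.
    hyperfocal = focal_mm ** 2 / (f_number * coc_mm) + focal_mm
    near = focus_mm * (hyperfocal - focal_mm) / (hyperfocal + focus_mm - 2 * focal_mm)
    if focus_mm >= hyperfocal:
        return near, float("inf"), float("inf")
    far = focus_mm * (hyperfocal - focal_mm) / (hyperfocal - focus_mm)
    return near, far, far - near


if __name__ == "__main__":
    # Same lens and focus distance, two aperture settings.
    for n in (2.8, 8.0):
        near, far, total = depth_of_field(50.0, n, 3000.0)
        print(f"f/{n}: sharp from {near:.0f} mm to {far:.0f} mm ({total:.0f} mm)")
```

Under these assumptions, stopping the same lens down from f/2.8 to f/8 widens the sharp range considerably, at the cost of admitting less light to the sensor.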
Traditional cameras rely on complex optical and mechanical components to realize the change of focus and aperture. Physical conditions limit the maximum resolution a camera can achieve. In a compound camera, these features may be implemented digitally by running the image processing software on the microprocessor.
However, conventional compound camera image processing systems have mainly followed two approaches. In computer vision, the common practice is to first recover the 3-dimensional geometry of the objects in the scene. This is called structure-from-motion. Next, the input images are transferred to the virtual camera via the recovered geometry. A good reference is Olivier Faugeras, “Three-Dimensional Computer Vision: A Geometric Viewpoint,” The MIT Press, 1996. The disclosure of the Faugeras text is hereby incorporated by reference for all purposes as if fully set forth herein. The problem with this approach is that the reconstructed geometry normally is not very accurate, especially on object surfaces that lack color texture. This results in visible artifacts in the synthesized image.
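The transfer of an input pixel to the virtual camera via recovered geometry can be sketched with a simple pinhole camera model. The following Python example is a minimal illustration only, not the method of any cited reference. The function `reproject`, the helper `mat_vec`, and the `(fx, fy, cx, cy)` intrinsics tuples are hypothetical names chosen for this sketch.

```python
def mat_vec(m, v):
    """Multiply a 3x3 matrix (nested lists) by a 3-vector."""
    return [sum(m[r][c] * v[c] for c in range(3)) for r in range(3)]


def reproject(pixel, depth, src_cam, dst_cam, rotation, translation):
    """Map a pixel seen by a component camera into the virtual camera.

    pixel:       (u, v) in the component (source) image
    depth:       recovered distance along the source optical axis
    src_cam, dst_cam: assumed pinhole intrinsics (fx, fy, cx, cy)
    rotation, translation: pose taking source-camera coordinates
                 into virtual-camera coordinates
    """
    fx, fy, cx, cy = src_cam
    u, v = pixel
    # Back-project the pixel to a 3-D point using the recovered depth.
    point = [(u - cx) / fx * depth, (v - cy) / fy * depth, depth]
    # Move the point into the virtual camera's coordinate frame.
    moved = [a + b for a, b in zip(mat_vec(rotation, point), translation)]
    if moved[2] <= 0:
        return None  # the point lies behind the virtual camera
    # Project onto the virtual camera's image plane.
    fx2, fy2, cx2, cy2 = dst_cam
    return (fx2 * moved[0] / moved[2] + cx2, fy2 * moved[1] / moved[2] + cy2)


if __name__ == "__main__":
    identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
    cam = (500.0, 500.0, 320.0, 240.0)
    # The same pixel transferred with a correct and an incorrect depth.
    print(reproject((320, 240), 2.0, cam, cam, identity, [0.1, 0.0, 0.0]))
    print(reproject((320, 240), 2.5, cam, cam, identity, [0.1, 0.0, 0.0]))
```

In this sketch, an error in the recovered depth sends the pixel to a different location in the virtual image, which is one way inaccurate geometry turns into visible artifacts.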
In computer graphics, the light field approach can be thought of as using only one depth plane. A good reference for the light field approach is M. Levoy and P. Hanrahan, “Light Field Rendering,” Proceedings of the ACM SIGGRAPH 96, pp. 31-42, 1996. The disclosure of the Levoy and Hanrahan text is hereby incorporated by reference for all purposes as if fully set forth herein. However, in the light field approach, in order to deal with blur, the component cameras must be densely placed. Densely placed cameras normally imply a large number of cameras. A large number of cameras, in turn, produces a large amount of data to be processed. This vastly increases the cost and complexity of the image processing system.
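The link between a single depth plane, blur, and camera density can be made concrete with a small calculation. The Python sketch below assumes parallel pinhole cameras and measures spacing and depth in the same unit. It is an illustrative model, not the formulation used in the light field literature, and the function names `misalignment_px` and `max_spacing` are hypothetical.

```python
def misalignment_px(focal_px, spacing, plane_depth, point_depth):
    """Pixel offset between two neighboring cameras' views of one point
    after both images are aligned on a single assumed depth plane."""
    return focal_px * spacing * abs(1.0 / point_depth - 1.0 / plane_depth)


def max_spacing(focal_px, plane_depth, point_depth, tolerance_px=1.0):
    """Largest camera spacing that keeps the offset within tolerance_px."""
    gap = abs(1.0 / point_depth - 1.0 / plane_depth)
    return float("inf") if gap == 0 else tolerance_px / (focal_px * gap)


if __name__ == "__main__":
    # Cameras 0.1 units apart, depth plane at 2, scene point at 4.
    print(misalignment_px(500.0, 0.1, 2.0, 4.0))
    # Spacing needed to keep that point within one pixel of blur.
    print(max_spacing(500.0, 2.0, 4.0))
```

Under these assumptions, points away from the chosen depth plane land on different pixels in neighboring cameras and appear blurred when the images are combined. Keeping that offset within about one pixel requires a much smaller spacing, and therefore many more cameras to cover the same area.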
Therefore, there is a need in the art for improved apparatuses and methods for processing video images. In particular, there is a need for image processing systems that implement improved auto-focus, high-resolution, and depth-of-field functions.