It is challenging to obtain visual situational awareness of objects and people that are partially occluded. These occlusions may be the result of branches, dense foliage or other objects. In most of these cases, different parts of the occluded object or person can be observed by moving the camera from one position to another. However, when the camera is moved, parts that were visible from earlier camera positions are lost from view. The lack of a complete and easily interpretable image makes it more difficult or even impossible to automatically detect objects like vehicles or to recognize human faces.
To deal with these kinds of challenges, state-of-the-art computer vision research focuses on camera arrays (typically more than 60 cameras). The results are very promising; however, the camera arrays developed thus far have large physical dimensions.
Synthetic Aperture (SA) Imaging in General
Large aperture cameras typically have a small depth of field, meaning that there is a reasonably small depth range at which objects are imaged sharply. Scene objects that are out of focus are blurred across the image. This effect increases with increasing aperture size. In order to use this effect to the extent of “removing” out-of-focus objects, a very large aperture is needed. Fortunately, this is possible by creating a synthetic aperture (SA), which may be constituted by an array (e.g. a matrix array) of cooperating cameras, as illustrated in FIG. 1. A synthetic aperture method creates a synthesized output image from the images of the cameras by means of a process of registration, the synthesized output image corresponding to an image of a virtual camera with a larger aperture size than those of the cooperating cameras. A synthesized output image of a synthetic aperture method typically has a more limited depth of focus than the images of the cooperating cameras, at a selectable focus distance from the virtual camera.
The array of cameras forms an imaging system having a large aperture. A single point on the chosen virtual focal plane is projected to different image plane positions in the cameras. The camera images have to be co-registered in order to map these positions to the same point. The average of all the camera images then forms the output image.
A partially occluding object in front of the focal plane will not be projected sharply on the image plane; different occluding parts are projected to the same point. As some cameras can see the scene point through the occlusion, the resulting image point is made from one or more views of the scene point plus varying occluding object points. As a result, the scene point will dominate in the output image. FIG. 1 illustrates the creation of an SA by the use of multiple cameras (image taken from [1]). The camera images are combined in such a way that scene points from the focal plane are projected to the same positions in the output image. A partially occluding object that is positioned in front of the focal plane appears blurred in the output image.
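The averaging step and its effect on a partial occlusion can be illustrated with a small numerical sketch. The toy scene, the number of cameras and the occluder layout below are assumptions chosen for illustration only; the function name synthetic_aperture_average is hypothetical and the views are assumed to be co-registered to the focal plane already.

```python
import numpy as np

def synthetic_aperture_average(registered_views):
    """Average co-registered views (shape: cameras x height x width)."""
    views = np.asarray(registered_views, dtype=float)
    return views.mean(axis=0)

# Toy focal-plane patch with uniform brightness 1.0 (assumed scene).
scene = np.ones((4, 4))
n_cameras = 8

views = []
for cam in range(n_cameras):
    view = scene.copy()
    # Assumed occluder: after registration, parallax puts the dark
    # occluder (value 0.0) on a different column in each camera.
    view[:, cam % 4] = 0.0
    views.append(view)

output = synthetic_aperture_average(views)
# Each pixel is blocked in 2 of the 8 views, so the scene value
# contributes 6/8 of every output pixel and the occluder is spread out.
print(output[0])
```

In this sketch every single view has one fully hidden column, whereas the averaged output shows the scene brightness everywhere at a reduced contrast of 0.75.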
Synthetic Aperture Imaging Using a Camera Array
In order to determine which image pixels belong to certain focal plane positions, a registration procedure is required. In general, one of the cameras is chosen as the reference view to which all other views are registered. A proven method for registration is to place a calibration target at the desired focal plane position, and determine where the target positions appear in the camera images. The cameras will then be calibrated for a fixed focal plane position. For each camera a projective transformation (homography) is calculated that maps the camera view to the chosen reference view. The camera images can then be transformed and averaged to form the output image.
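As an illustration of the registration step, the following is a minimal sketch of how a projective transformation might be estimated from calibration target points seen in one camera and in the reference view. It uses the standard direct linear transform (DLT), which is an assumption here and is not stated in the text as the method used; the function names, the point coordinates and the matrix H_true are hypothetical. Coordinate normalization, which is advisable for noisy measurements, is omitted for brevity.

```python
import numpy as np

def estimate_homography(src_pts, dst_pts):
    """Estimate the 3x3 matrix H mapping src_pts to dst_pts (N >= 4) by DLT."""
    src = np.asarray(src_pts, dtype=float)
    dst = np.asarray(dst_pts, dtype=float)
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        # Two linear constraints per point correspondence.
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    A = np.array(rows)
    # The solution is the right singular vector of the smallest singular value.
    _, _, vt = np.linalg.svd(A)
    H = vt[-1].reshape(3, 3)
    return H / H[2, 2]  # assumes H[2, 2] is not zero

def apply_homography(H, pts):
    """Map Nx2 points through H using homogeneous coordinates."""
    pts = np.asarray(pts, dtype=float)
    homog = np.hstack([pts, np.ones((len(pts), 1))])
    mapped = homog @ H.T
    return mapped[:, :2] / mapped[:, 2:3]

# Hypothetical calibration target points in one camera view (pixels).
src = np.array([[0, 0], [100, 0], [100, 100], [0, 100], [50, 50]], dtype=float)
# Hypothetical ground-truth mapping to the reference view, for the demo only.
H_true = np.array([[1.1, 0.02, 5.0],
                   [-0.03, 0.95, -3.0],
                   [1e-4, 2e-4, 1.0]])
dst = apply_homography(H_true, src)

H_est = estimate_homography(src, dst)
print(np.round(H_est, 4))
```

With noise-free correspondences the estimate reproduces the assumed mapping; with real calibration images more points and a normalized or robust estimator would typically be used.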
It is possible to set the focal plane to a different position by shifting the images by an amount proportional to their mutual distances. It can be seen in FIG. 2 that the focal plane position depends on the camera spacings when the mappings of the views remain fixed. Changing these spacings amounts to changing the depth of the focal plane. However, instead of changing the camera spacings, the view mappings can also be shifted [3]. This is the easiest way of varying the focal plane distance. FIG. 2 illustrates that changing the focal plane position for a camera array is equivalent to changing mutual camera spacings under fixed image shifts. In practice, the camera array can remain fixed while the camera images are shifted instead.
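The shift-based refocusing described above can be sketched as follows, assuming a one-dimensional camera array whose offsets from the reference camera are known. The function name refocus, the sign convention, the offsets and the toy images are assumptions made for illustration; np.roll wraps pixels around the image border, which a practical implementation would replace by padding or cropping.

```python
import numpy as np

def refocus(views, offsets, disparity):
    """Shift each view by disparity * offset pixels along x, then average.

    disparity is the shift in pixels per unit of camera offset and selects
    the focal plane (assumed sign convention).
    """
    acc = np.zeros_like(views[0], dtype=float)
    for view, offset in zip(views, offsets):
        shift = int(round(disparity * offset))
        acc += np.roll(view, shift, axis=1)
    return acc / len(views)

# Five cameras at assumed offsets relative to the reference camera.
offsets = [-2, -1, 0, 1, 2]

# A single bright scene point seen at column 10 in the reference view and
# displaced by 2 pixels per unit of camera offset in the other views.
views = []
for b in offsets:
    img = np.zeros((1, 32))
    img[0, 10 - 2 * b] = 1.0
    views.append(img)

sharp = refocus(views, offsets, disparity=2)    # focal plane at the point
blurred = refocus(views, offsets, disparity=0)  # focal plane elsewhere
print(sharp[0, 10], blurred.max())
```

Stepping the disparity parameter through a range of values corresponds to stepping the focal plane through the scene without moving the cameras.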
When the focal plane position is varied in this way, it is possible to “step” through a recorded image set and visualize all scene elements at specific distances. An example is given in FIG. 4 for the input images from FIG. 3. Here, a camera is looking at a bush behind which a person is hiding. Due to the occlusion by the bush, the person is not visible in any of the example input images of FIG. 3. However, SA images can be constructed from the input imagery, using different focal plane distances. For example, the focal plane position selected for the image at top right in FIG. 4 leads to an image that is only sharp for the distance at which the checkerboard object was placed. Furthermore, the focal plane position selected for the image at bottom left in FIG. 4 reveals that a person is hiding behind the bush. Synthetic aperture techniques can therefore be used to improve the visibility of objects that are hidden behind partial occlusions.
Prior-art SA techniques require large camera arrays. State-of-the-art camera array systems typically use more than 60 cameras. For example, the Stanford camera array consists of 88 cameras [1]. Although the results are very promising, the large physical dimensions are disadvantageous for several applications, e.g. in airplanes.