Image fusion is a process that combines two or more source images to form a single composite image with extended information content. Typically the source images come from different sensors, such as infra-red and visible cameras, or computer aided tomography (CAT) and magnetic resonance imaging (MRI) systems. Images of a given scene taken with a single type of sensor but under different imaging conditions, such as different scene illumination or camera focus, may also be combined. Image fusion is successful to the extent that: (1) the composite image retains all useful information from the source images, (2) the composite image does not contain any artifacts generated by the fusion process, and (3) the composite image looks natural, so that it can be readily interpreted through normal visual perception by humans or machines. What counts as useful information is determined by the user of the composite image, and this in turn determines which features of the different source images are selected for inclusion in the composite image.
The most direct approach to fusion is to align the source images, then sum, or average, across images at each pixel position. This and other pixel-based approaches often yield unsatisfactory results since individual source features appear in the composite with reduced contrast or appear jumbled as in a photographic double exposure.
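The loss of contrast under pixel averaging can be illustrated with a minimal sketch. The function names and the tiny one-row images below are hypothetical and chosen only for illustration; the images are assumed to be already aligned and of equal size.

```python
def average_fusion(image_a, image_b):
    """Fuse two aligned, equal-sized grayscale images by per-pixel averaging."""
    return [
        [(pa + pb) / 2.0 for pa, pb in zip(row_a, row_b)]
        for row_a, row_b in zip(image_a, image_b)
    ]


def contrast(image):
    """Crude contrast measure: brightest pixel minus darkest pixel."""
    flat = [p for row in image for p in row]
    return max(flat) - min(flat)


# Hypothetical sources: a bright feature appears only in image A,
# while image B is featureless at the same location.
image_a = [[0.0, 100.0, 0.0, 0.0]]
image_b = [[50.0, 50.0, 50.0, 50.0]]

fused = average_fusion(image_a, image_b)
print(contrast(image_a), contrast(fused))
```

In this example the feature survives in the composite, but its contrast against the background is halved relative to source A.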
Pattern-selective image fusion tries to overcome these deficiencies by identifying salient features in the source images and preserving these features in the composite at full contrast. Each source image is first decomposed into a set of primitive pattern elements. A set of pattern elements for the composite image is then assembled by selecting salient patterns from the primitive pattern elements of the source images. Finally, the composite image is constructed from its set of primitive pattern elements.
Burt in "Fast Filter Transforms For Image Processing", in Multiresolution Image Processing And Analysis, volume 16, pages 20-51, 1981, and Anderson et al in U.S. Pat. No. 4,692,806 entitled "Image-Data Reduction Technique", incorporated herein by reference for its teaching on image decomposition techniques, have disclosed an image decomposition technique in which an original, comparatively high-resolution image comprising a first number of pixels is processed to derive a wide field-of-view, low-resolution image comprising a second number of pixels smaller than the first number. The process for decomposing the image to produce lower resolution images is typically performed using a plurality of low-pass filters of differing bandwidth having a Gaussian roll-off. van der Wal in U.S. Pat. No. 4,703,514 entitled "Programmed Implementation Of Real-Time Multiresolution Signal Processing Apparatus", incorporated herein by reference, has disclosed a means for implementing the pyramid process for the analysis of images.
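The decomposition into successively lower-resolution images can be sketched as repeated low-pass filtering followed by subsampling. The sketch below works on a one-dimensional signal for brevity, and it assumes a 5-tap binomial kernel (1, 4, 6, 4, 1)/16 with replicated borders as the Gaussian-like filter; neither the kernel nor the border handling is specified in the references cited above, and the function names are hypothetical.

```python
# Assumed Gaussian-like low-pass kernel; the weights sum to 1.
KERNEL = (1 / 16, 4 / 16, 6 / 16, 4 / 16, 1 / 16)


def blur(signal):
    """Low-pass filter a 1-D signal, replicating the border samples."""
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(KERNEL):
            j = min(max(i + k - 2, 0), n - 1)  # clamp index to valid range
            acc += w * signal[j]
        out.append(acc)
    return out


def reduce(signal):
    """One pyramid step: low-pass filter, then keep every second sample."""
    return blur(signal)[::2]


def gaussian_pyramid(signal, levels):
    """Return [full resolution, half resolution, quarter resolution, ...]."""
    pyramid = [list(signal)]
    for _ in range(levels - 1):
        pyramid.append(reduce(pyramid[-1]))
    return pyramid


# Hypothetical 16-sample step edge.
step = [0.0] * 8 + [100.0] * 8
pyramid = gaussian_pyramid(step, 3)
print([len(level) for level in pyramid])
```

Each level holds half as many samples as the one before it, so the coarsest level covers the same field of view at a fraction of the original resolution.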
The Laplacian pyramid approach to image fusion is perhaps the best known pattern-selective method. P. Burt in Multiresolution Image Processing and Analysis, A. Rosenfeld, Ed., Springer Verlag, New York, 1984 first disclosed the use of image fusion techniques based on the Laplacian pyramid for binocular fusion in human vision. E. H. Adelson in U.S. Pat. No. 4,661,986 disclosed the use of the Laplacian technique for the construction of an image with an extended depth of field from a set of images taken with a fixed camera but with different focal settings. A. Toet in Machine Vision and Applications, volume 3, pages 1-11 (1990) has disclosed a modified Laplacian pyramid that has been used to combine visible and IR images for surveillance applications. More recently M. Pavel et al in Proceedings of the AIAA Conference on Computing in Aerospace, volume 8, Baltimore, October 1991 have disclosed a Laplacian pyramid for combining a camera image with graphically generated imagery as an aid to aircraft landing. Burt and Adelson in ACM Trans. on Graphics, volume 2, pages 217-236 (1983) and in the Proceedings of SPIE, volume 575, pages 173-181 (1985) have developed related Laplacian pyramid techniques to merge images into mosaics for a variety of applications.
In effect, a Laplacian transform is used to decompose each source image into regular arrays of Gaussian-like basis functions of many sizes. These patterns are sometimes referred to as basis functions of the pyramid transform, or as wavelets. The multiresolution pyramid of source images permits coarse features to be analyzed at low resolution and fine features to be analyzed at high resolution. Each sample value of a pyramid represents the amplitude associated with a corresponding basis function. In the Laplacian pyramid approach to fusion cited above, the combination process selects the most prominent of these patterns from the source images for inclusion in the fused image. The source pyramids are combined through selection on a sample-by-sample basis to form a composite pyramid. Current practice is to use a "choose max rule" in this selection; that is, at each sample location in the source pyramids, the source sample with the largest value is copied to become the corresponding sample in the composite pyramid. Finally, the composite image is recovered from the composite pyramid through an inverse Laplacian transform.
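The decompose, select, and reconstruct steps described above can be sketched for a one-dimensional signal as follows. This is an illustration under stated assumptions rather than the method of any cited reference: the 5-tap kernel, the border handling, the interpretation of "largest value" as largest magnitude, and the application of the same rule to the coarsest level are all assumptions, and every function name is hypothetical.

```python
KERNEL = (1 / 16, 4 / 16, 6 / 16, 4 / 16, 1 / 16)  # assumed low-pass kernel


def blur(signal):
    """Low-pass filter with replicated borders."""
    n = len(signal)
    return [
        sum(w * signal[min(max(i + k - 2, 0), n - 1)] for k, w in enumerate(KERNEL))
        for i in range(n)
    ]


def reduce(signal):
    """Filter, then keep every second sample."""
    return blur(signal)[::2]


def expand(signal, size):
    """Upsample to `size` samples by zero insertion, then interpolate."""
    up = [0.0] * size
    for i, v in enumerate(signal):
        up[2 * i] = v
    return [2.0 * v for v in blur(up)]  # factor 2 restores the mean level


def laplacian_pyramid(signal, levels):
    """Band-pass levels (fine to coarse) followed by a low-pass residual."""
    pyramid, current = [], list(signal)
    for _ in range(levels - 1):
        smaller = reduce(current)
        predicted = expand(smaller, len(current))
        pyramid.append([c - p for c, p in zip(current, predicted)])
        current = smaller
    pyramid.append(current)
    return pyramid


def reconstruct(pyramid):
    """Inverse transform: expand each level and add back the band above it."""
    current = pyramid[-1]
    for band in reversed(pyramid[:-1]):
        predicted = expand(current, len(band))
        current = [b + p for b, p in zip(band, predicted)]
    return current


def choose_max(sample_a, sample_b):
    """Assumed reading of the choose max rule: keep the larger magnitude."""
    return sample_a if abs(sample_a) >= abs(sample_b) else sample_b


def fuse(signal_a, signal_b, levels=3):
    """Sample-by-sample selection between two source pyramids."""
    pyr_a = laplacian_pyramid(signal_a, levels)
    pyr_b = laplacian_pyramid(signal_b, levels)
    composite = [
        [choose_max(a, b) for a, b in zip(level_a, level_b)]
        for level_a, level_b in zip(pyr_a, pyr_b)
    ]
    return reconstruct(composite)


# Hypothetical sources: each contains a feature the other lacks.
source_a = [0.0] * 4 + [80.0] * 4 + [0.0] * 8
source_b = [0.0] * 12 + [60.0] * 2 + [0.0] * 2
fused = fuse(source_a, source_b)
```

Because each band is stored as the difference between a level and the expanded version of the next coarser level, the reconstruction step inverts the decomposition up to floating-point rounding, whatever interpolation `expand` uses.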
In the case of the Laplacian transform, the component patterns take the form of circularly symmetric Gaussian-like intensity functions. Component patterns of a given scale tend to have large amplitude where there are distinctive features in the image of about that scale. Most image patterns can be described as being made up of edge-like primitives. The edges in turn are represented within the pyramid by collections of component patterns.
While the Laplacian pyramid technique has been found to provide good results, visible artifacts are sometimes introduced into the composite image. These may occur, for example, along extended contours in the scene, because such higher level patterns are represented in the Laplacian pyramid rather indirectly. An intensity edge is represented in the Laplacian pyramid by Gaussian patterns at all scales with positive values on the lighter side of the edge, negative values on the darker side, and zero at the location of the edge itself. If not all of these primitives survive the selection process, the contour is not completely rendered in the composite. An additional shortcoming arises because the Gaussian-like component patterns have non-zero mean values. Errors in the selection process therefore lead to changes in the average image intensity within local regions of a scene. These artifacts are particularly noticeable when sequences of composite or fused images are displayed. The selection process is intrinsically binary: the basis function from one or the other source image is chosen. If the magnitudes of the basis functions vary, for example because of noise in the image or sensor motion, the selection process may alternately select the basis functions from different source images. This leads to unduly perceptible artifacts such as flicker and crawlers.
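The frame-to-frame instability of a binary selection can be illustrated with a small sketch. The coefficient values below are hypothetical, and the selection rule is assumed to compare magnitudes; the point is only that a small fluctuation in one source can flip the choice between sources.

```python
def choose_max(sample_a, sample_b):
    """Assumed binary selection: keep the sample with the larger magnitude."""
    return sample_a if abs(sample_a) >= abs(sample_b) else sample_b


# Hypothetical values of one pyramid sample over four successive frames.
# Source A fluctuates slightly (for example, from sensor noise);
# source B is steady and of nearly equal magnitude but opposite sign.
frames_a = [1.00, 0.98, 1.01, 0.97]
frames_b = [-0.99, -0.99, -0.99, -0.99]

composite = [choose_max(a, b) for a, b in zip(frames_a, frames_b)]
source_picked = ["A" if c == a else "B" for c, a in zip(composite, frames_a)]
print(composite, source_picked)
```

Although neither source changes by more than a few percent, the composite sample swings between roughly +1 and -1 from one frame to the next, which would be seen as flicker in a displayed sequence.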
Thus there is a need for improved methods of image fusion which overcome these shortcomings in the prior art and provide better image quality in a composite image formed by the image fusion process, particularly when sequences of composite images are displayed.