1. Field of the Invention
This invention relates to a multiresolutional filter which generates hierarchical images. In particular, this invention relates to a method for generating an image having a lower resolution using a multiresolutional filter, and an image matching method capable of using this filtering method.
2. Description of the Related Art
Automatic matching of two images, that is, correspondence between image regions or pixels, has been one of the most important and difficult themes of computer vision and computer graphics. For instance, once the images of an object from different view angles are matched, they can be used as the base for generating other views. When the matching of right-eye and left-eye images is computed, the result can immediately be used for stereo photogrammetry. When a model facial image is matched with another facial image, it can be used to extract characteristic facial parts such as the eyes, the nose, and the mouth. When two images of, for example, a man and a cat are matched exactly, all the in-between images can be generated and hence morphing can be done fully automatically.
However, in the existing methods, the correspondence between points of the two images must generally be specified manually, which is a tedious process. To solve this problem, various methods for automatically detecting the correspondence of points have been proposed. For instance, use of an epipolar line has been suggested to reduce the number of candidate pairs of points, but the complexity remains high. To reduce the complexity, the coordinate values of a point in the left-eye image are usually assumed to be close to those of the corresponding point in the right-eye image. Imposing such a restriction, however, makes it very difficult to match global and local characteristics simultaneously.
In volume rendering, a series of cross-sectional images is used to constitute voxels. In such a case, it is conventionally assumed that a pixel in the upper cross-sectional image corresponds to the pixel that occupies the same position in the lower cross section, and this pair of pixels is used for the interpolation. Because this method is so simple, volume rendering tends to suffer from unclear reconstruction of objects when the distance between consecutive cross sections is large and the shape of the cross sections of the objects therefore changes considerably.
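The same-position assumption can be illustrated with a minimal sketch. The function name `interpolate_slices`, the blending parameter `t`, and the toy one-row slices are assumptions made for illustration only; they are not taken from any particular volume rendering system.

```python
def interpolate_slices(upper, lower, t):
    """Blend two equally sized 2D slices pixel by pixel.

    Each pixel in `upper` is paired with the pixel at the same position
    in `lower` (the conventional assumption). t=0 returns the upper
    slice and t=1 returns the lower slice.
    """
    return [
        [(1 - t) * u + t * l for u, l in zip(row_u, row_l)]
        for row_u, row_l in zip(upper, lower)
    ]

# Hypothetical one-row slices: a bright object at x=1 in the upper
# slice has moved to x=3 in the lower slice.
upper = [[0, 9, 0, 0]]
lower = [[0, 0, 0, 9]]

# Halfway slice under the same-position assumption.
middle = interpolate_slices(upper, lower, 0.5)
```

In this toy case the halfway slice contains two half-intensity spots at x=1 and x=3 rather than a single spot at x=2, which is the kind of unclear reconstruction described above when the shape changes between consecutive cross sections.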
A great number of image matching algorithms, such as the stereo photogrammetry methods, use edge detection. In such a method, however, the resulting matched pairs of points are sparse. To fill the gaps between the matched points, the disparity values are interpolated. In general, all edge detectors suffer from the problem of judging whether a change in the pixel intensity within the local window they use really indicates the existence of an edge. These edge detectors are also sensitive to noise, because all edge detectors are high-pass filters by nature and hence detect noise at the same time.
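The noise sensitivity of a high-pass edge detector can be sketched with a first-difference filter, which is about the simplest high-pass filter there is. The function names, the threshold value, and the sample row are hypothetical; practical edge detectors are considerably more elaborate.

```python
def difference_filter(row):
    """First difference of neighbouring pixels (a simple high-pass filter)."""
    return [row[i + 1] - row[i] for i in range(len(row) - 1)]

def detect_edges(row, threshold):
    """Return the indices where the absolute difference exceeds the threshold."""
    return [i for i, d in enumerate(difference_filter(row)) if abs(d) > threshold]

# Hypothetical intensity row: a single noisy pixel at x=2 and a true
# step edge between x=4 and x=5.
row = [10, 10, 13, 10, 10, 50, 50, 50]
edges = detect_edges(row, threshold=2)
```

With this threshold the detector reports responses on both sides of the noisy pixel as well as at the true step, so the local window alone cannot tell which responses correspond to a real edge.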
Optical flow is another known method. Given two images, optical flow detects the motion of objects (rigid bodies) in the images. It assumes that the intensity of each pixel of the objects does not change, and computes the motion vector (u,v) of each pixel under additional conditions such as the smoothness of the vector field of (u,v). Optical flow, however, cannot detect the global correspondence between images because it concerns only the local change of pixel intensity, and systematic errors are conspicuous when the displacements are large.
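The constant-intensity assumption and its failure for large displacements can be sketched in one dimension. The code below is a least-squares solution of the linearized constraint Ix*u + It = 0 over a whole row, in the style of the Lucas-Kanade method; it is an illustrative assumption, not the formulation of any specific optical flow method referred to above. The helper names, the Gaussian test signal, and the shift values are all hypothetical.

```python
import math

def gaussian_row(n, center, sigma=5.0):
    """A smooth 1D intensity bump used as a toy image row."""
    return [math.exp(-((x - center) ** 2) / (2 * sigma ** 2)) for x in range(n)]

def estimate_shift(frame1, frame2):
    """Least-squares estimate of a single displacement u from Ix*u + It = 0.

    Ix is a central difference on frame1 and It is the frame difference,
    so the estimate relies on a local, first-order approximation.
    """
    num = 0.0
    den = 0.0
    for x in range(1, len(frame1) - 1):
        ix = (frame1[x + 1] - frame1[x - 1]) / 2.0  # spatial gradient
        it = frame2[x] - frame1[x]                  # temporal difference
        num += ix * it
        den += ix * ix
    return -num / den

base = gaussian_row(64, 22.0)
small = estimate_shift(base, gaussian_row(64, 22.5))  # true shift: 0.5 pixel
large = estimate_shift(base, gaussian_row(64, 42.0))  # true shift: 20 pixels
```

For the half-pixel shift the estimate lands close to the true value, while for the 20-pixel shift it falls far short, because the first-order approximation only describes the local change of intensity.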
To recognize the global structures, a great number of multiresolutional filters have been proposed. They are classified into two groups: linear filters and nonlinear filters. An example of the former is a wavelet. However, linear filters are not useful for image matching, because the information on the pixel intensity of extrema, as well as on their locations, is blurred. FIGS. 1(a) and 1(b) show the result of the application of an averaging filter to the facial images in FIGS. 19(a) and 19(b), respectively. FIGS. 1(k)-1(l) show the results of the application of the scaling function of the Battle-Lemarie wavelet to the same facial images. As shown in these drawings, the pixel intensity of extrema is reduced through averaging, while their locations are undesirably shifted by the same averaging. As a result, the information on the locations of the eyes (minima of the intensity) is ambiguous at this coarse level of resolution, and hence it is impossible to compute the correct matching at this level. Therefore, although a coarse level is prepared for the purpose of global matching, the obtained global matching does not correctly match the true characteristics of the images (the eyes, i.e., the minima). Even when the eyes appear clearly at a finer level of resolution, it is too late to undo the errors introduced in the global matching. It has also been pointed out that smoothing the input images filters out stereo information in textured regions.
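The loss of extremum intensity under averaging can be seen in a minimal sketch. It uses a plain box average over non-overlapping pixel pairs, which is an assumption chosen for simplicity; it is not the Battle-Lemarie scaling function, and the sample row is hypothetical.

```python
def average_downsample(row):
    """Halve the resolution by averaging non-overlapping pairs of pixels."""
    return [(row[i] + row[i + 1]) / 2 for i in range(0, len(row) - 1, 2)]

# Hypothetical intensity row: a dark, one-pixel-wide minimum (such as an
# eye) at x=3 on a bright background.
row = [200, 200, 200, 40, 200, 200, 200, 200]
coarse = average_downsample(row)

finest_minimum = min(row)      # 40
coarse_minimum = min(coarse)   # 120.0
```

The minimum that had intensity 40 at the fine level appears with intensity 120 at the coarse level, so its contrast against the background is halved after a single level of averaging.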
On the other hand, 1D sieve operators have become available as nonlinear filters which can be used for morphological operations. 1D sieve operators smooth out the images while preserving scale-space causality by choosing the minimum (or the maximum) inside a window of a certain size. The resulting image is of the same size as the original, but is simpler because small undulations are removed. Although this operator may be classified as "a multiresolutional filter" in the broad sense that it reduces image information, it is not a multiresolutional filter in the normal sense, because it does not put images into a hierarchy while changing their resolution as wavelets do. This operator thus cannot be utilized for detection of correspondence between images.
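The window-minimum step described above can be sketched as follows. This is only a sliding-window minimum, not a full sieve operator; the function name `window_min`, the `radius` parameter, and the sample signal are assumptions made for illustration.

```python
def window_min(signal, radius):
    """Replace each sample by the minimum inside a window around it.

    The window spans `radius` samples on each side and is truncated at
    the borders, so the output has the same length as the input.
    """
    n = len(signal)
    return [
        min(signal[max(0, i - radius): min(n, i + radius + 1)])
        for i in range(n)
    ]

# Hypothetical signal: a narrow bump at x=2 and a wider dip at x=6..8.
signal = [5, 5, 9, 5, 5, 5, 1, 1, 1, 5]
smoothed = window_min(signal, 1)
```

The narrow bump disappears and the output keeps the length of the input, which matches the point above: the signal becomes simpler, but no lower-resolution level is produced, so no hierarchy results.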