1. Field of the Invention
This invention relates to a multiresolutional filter which generates hierarchical images. In particular, this invention relates to a method for generating an image having a lower resolution using a multiresolutional filter, and an image matching method capable of using this filtering method.
2. Description of the Related Art
Automatic matching of two images, that is, correspondence between image regions or pixels, has been one of the most important and difficult themes of computer vision and computer graphics. For instance, once the images of an object from different view angles are matched, they can be used as the base for generating other views. When the matching of right-eye and left-eye images is computed, the result can immediately be used for stereo photogrammetry. When a model facial image is matched with another facial image, it can be used to extract characteristic facial parts such as the eyes, the nose, and the mouth. When two images of, for example, a man and a cat are matched exactly, all the in-between images can be generated and hence morphing can be done fully automatically.
However, in the existing methods, the correspondence of the points of the two images must generally be specified manually, which is a tedious process. In order to solve this problem, various methods for automatically detecting correspondence of points have been proposed. For instance, application of an epipolar line has been suggested to reduce the number of candidate pairs of points, but the complexity is high. To reduce the complexity, the coordinate values of a point in the left-eye image are usually assumed to be close to those of the corresponding point in the right-eye image. Imposing such a restriction, however, makes it very difficult to simultaneously match global and local characteristics.
In volume rendering, a series of cross-sectional images is used for constituting voxels. In such a case, conventionally, it is assumed that a pixel in the upper cross-sectional image corresponds to the pixel that occupies the same position in the lower cross section, and this pair of pixels is used for the interpolation. Using this very simple method, volume rendering tends to suffer from unclear reconstruction of objects when the distance between consecutive cross sections is long and the shape of the cross sections of the objects thus changes widely.
A great number of image matching algorithms, such as the stereo photogrammetry methods, use edge detection. In such a method, however, the resulting matched pairs of points are sparse. To fill the gaps between the matched points, the disparity values are interpolated. In general, all edge detectors suffer from the problem of judging whether a change in the pixel intensity in the local window they use really suggests the existence of an edge. These edge detectors also suffer from noise, because all edge detectors are high-pass filters by nature and hence detect noise at the same time.
Optical flow is another known method. Given two images, optical flow detects the motion of objects (rigid bodies) in the images. It assumes that the intensity of each pixel of the objects does not change and computes the motion vector (u,v) of each pixel together with some additional conditions such as the smoothness of the vector field of (u,v). Optical flow, however, cannot detect the global correspondence between images because it concerns only the local change of pixel intensity, and systematic errors are conspicuous when the displacements are large.
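For reference, the constant-intensity assumption and a smoothness condition of the kind mentioned above are commonly written as follows. This is the standard Horn-Schunck style formulation, given only as one well-known example; the symbols I, I_x, I_y, I_t and the weight \alpha are notation assumed here, not taken from this description.

```latex
% Constant intensity of a moving point between consecutive frames:
I(x + u,\; y + v,\; t + 1) = I(x,\; y,\; t)

% First-order (small-displacement) linearization, the optical flow constraint:
I_x u + I_y v + I_t = 0

% One common way to add a smoothness condition on the field (u, v):
E(u, v) = \iint \Big[ (I_x u + I_y v + I_t)^2
        + \alpha^2 \big( \lVert \nabla u \rVert^2 + \lVert \nabla v \rVert^2 \big) \Big] \, dx \, dy
```

The linearization holds only for small (u,v), which is consistent with the observation that errors become conspicuous when the displacements are large.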
To recognize global structures, a great number of multiresolutional filters have been proposed. They are classified into two groups: linear filters and nonlinear filters. An example of the former is a wavelet. However, linear filters are not useful for image matching, because the information on the pixel intensity of extrema, as well as on their locations, is blurred. FIGS. 1(a) and 1(b) show the result of the application of an averaging filter to the facial images in FIGS. 19(a) and 19(b), respectively. FIGS. 1(k)-1(l) show the results of the application of the scaling function of the Battle-Lemarie wavelet to the same facial images. As shown in these drawings, the pixel intensity of extrema is reduced through averaging while their locations are undesirably shifted. As a result, the information on the locations of the eyes (minima of the intensity) is ambiguous at this coarse level of resolution, and hence it is impossible to compute the correct matching at this level. Therefore, although a coarse level is prepared for the purpose of global matching, the obtained global matching does not correctly match the true characteristics of the images (the eyes, i.e., the minima). Even when the eyes appear clearly at a finer level of resolution, it is too late to correct the errors introduced in the global matching. As has been pointed out, smoothing the input images also filters out stereo information in textured regions.
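The attenuation of an extremum under averaging can be illustrated with a small sketch. It assumes a toy 4x4 image and non-overlapping 2x2 blocks; the helper name `block_reduce` and the pixel values are hypothetical and are not taken from the figures. It illustrates only the loss of intensity, not the shift in location.

```python
import numpy as np

def block_reduce(img, op):
    """Halve each dimension by applying `op` over non-overlapping 2x2 blocks."""
    h, w = img.shape
    blocks = img.reshape(h // 2, 2, w // 2, 2)
    return op(blocks, axis=(1, 3))

# Toy image (assumed values): bright background with one dark "eye" pixel.
img = np.full((4, 4), 200.0)
img[1, 2] = 10.0

averaged = block_reduce(img, np.mean)  # linear (averaging) reduction
minimum = block_reduce(img, np.min)    # nonlinear reduction that keeps the minimum

print(averaged[0, 1])  # 152.5: the dark minimum is largely washed out
print(minimum[0, 1])   # 10.0: the minimum intensity survives unchanged
```

In the averaged image the dark pixel is raised from 10 to 152.5, close to the background value of 200, whereas the minimum-based reduction carries the original value to the coarser level.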
On the other hand, 1D sieve operators have become available as nonlinear filters which can be used for morphological operations. 1D sieve operators smooth out the images while preserving scale-space causality by choosing the minimum (or the maximum) inside a window of a certain size. The resulting image is of the same size as the original, but is simpler because small undulations are removed. Although this operator may be classified as “a multiresolutional filter” in the broad sense that it reduces image information, it is not a multiresolutional filter in the normal sense, as it does not put images into a hierarchy while changing the resolution of the images as wavelets do. This operator thus cannot be utilized for detection of correspondence between images.
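The windowed-minimum operation described above can be sketched as follows. This is a deliberately simplified sliding-window minimum, not a full sieve operator; the function name `window_min`, the window size, and the sample signal are assumptions made for illustration.

```python
def window_min(signal, size):
    """Replace each sample by the minimum inside a window centered on it.

    The window is truncated at the borders, so the output has the same
    length as the input.
    """
    half = size // 2
    n = len(signal)
    return [min(signal[max(0, i - half): min(n, i + half + 1)]) for i in range(n)]

signal = [5, 5, 6, 5, 5, 1, 1, 1, 5, 5]
smoothed = window_min(signal, 3)
print(smoothed)  # [5, 5, 5, 5, 1, 1, 1, 1, 1, 5]
```

The small bump (the value 6) disappears, but the output still has ten samples: the signal is simplified without any change of resolution, so no hierarchy of images is produced.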
In view of the above, the following problems are presented.
1. Image processing methods have rarely been available for accurately identifying the characteristics of an image through relatively simple processing. In particular, few effective proposals have been made for a method of extracting characteristics of an image while preserving information such as the pixel value or location of a characteristic point.
2. Automatic detection of a corresponding point based on the characteristics of an image has generally had problems, including complex processing and low robustness to noise. In addition, various restrictions have had to be imposed in processing, and it has been difficult to obtain a matching which satisfies global and local characteristics at the same time.
3. Although a multiresolutional filter is introduced for recognition of the global structure or characteristics of an image, in the case of a linear filter, information regarding the intensity and location of a pixel becomes blurred. As a result, corresponding points can hardly be recognized with sufficient accuracy. In addition, the 1D sieve operator, which is a non-linear filter, does not hierarchize an image, and cannot be used for detection of a corresponding point between images.
4. Because of the above problems, extensive manual labor has inevitably been required in order to obtain corresponding points accurately.
The present invention has been conceived to overcome the above problems, and aims to provide techniques for allowing accurate recognition of image characteristics in the image processing field.
In one aspect of the present invention, a new multiresolutional image filter is proposed. This filter is called a critical point filter as it extracts a critical point from an image. A critical point stands for a point having a certain characteristic in an image, including a maximum, where a pixel value (that is, an arbitrary value for an image or a pixel, such as a color number or the intensity) becomes maximum in a certain region, a minimum, where it becomes minimum, and a saddle point, where it becomes maximum for one direction and minimum for another. A critical point may be based on a topological concept, but it may possess any other characteristics. Selection of criteria for a critical point is not an essential matter in this invention.
In the above aspect, image processing using a multiresolutional filter is carried out. In a detection step, a two dimensional search is performed on a first image to detect a critical point. In a following generation step, the detected critical point is extracted for generation of a second image having a lower resolution than that of the first image. The second image inherits critical points from the first image. The second image, having a lower resolution than the first image, is preferably used for recognition of global characteristics of an image.
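The detection and generation steps can be sketched as below. This is a minimal sketch under stated assumptions, not the filter as claimed: it assumes non-overlapping 2x2 blocks, treats the block minimum and maximum as the extracted critical points, and uses "maximum of the row minima" and "minimum of the row maxima" as two hypothetical stand-ins for saddle points. The function name and dictionary keys are illustrative only.

```python
import numpy as np

def critical_point_subimages(img):
    """Generate half-resolution images that inherit critical values per 2x2 block."""
    img = np.asarray(img)
    h, w = img.shape
    if h % 2 or w % 2:
        raise ValueError("height and width must both be even")
    blocks = img.reshape(h // 2, 2, w // 2, 2)
    a = blocks[:, 0, :, 0]  # top-left pixel of each block
    b = blocks[:, 0, :, 1]  # top-right
    c = blocks[:, 1, :, 0]  # bottom-left
    d = blocks[:, 1, :, 1]  # bottom-right
    row_min_top, row_min_bottom = np.minimum(a, b), np.minimum(c, d)
    row_max_top, row_max_bottom = np.maximum(a, b), np.maximum(c, d)
    return {
        "minimum": np.minimum(row_min_top, row_min_bottom),
        # Assumed saddle criteria: minimum along one direction, maximum along the other.
        "saddle_max_of_min": np.maximum(row_min_top, row_min_bottom),
        "saddle_min_of_max": np.minimum(row_max_top, row_max_bottom),
        "maximum": np.maximum(row_max_top, row_max_bottom),
    }

first_image = np.array([[1, 2, 9, 9],
                        [3, 4, 9, 0]])
second_images = critical_point_subimages(first_image)
print(second_images["minimum"])  # [[1 0]]
print(second_images["maximum"])  # [[4 9]]
```

Each output pixel is an actual pixel value of the first image rather than a blend of several, which is the sense in which the lower-resolution image inherits critical points.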
Another aspect of the invention relates to an image matching method using a critical point filter. In this aspect, source and destination images are matched. The terms “a source image” and “a destination image” are used only to distinguish the two images, and there is no essential difference between them.
In a first step of this aspect, a critical point filter is applied to a source image to generate a series of source hierarchical images each having a different resolution. In a second step, a critical point filter is applied to a destination image to generate a series of destination hierarchical images. Source and destination hierarchical images stand for a group of images which are obtained by hierarchizing source and destination images, respectively, and each consist of two or more images. In a third step, matching between source and destination hierarchical images is computed. In this aspect, image characteristics concerning a critical point are extracted and/or clarified using a multiresolutional filter. This facilitates matching. According to this aspect, matching may be totally unconstrained.
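The first two steps can be sketched as a repeated halving of each image. The sketch assumes square images whose side is a power of two and uses a plain 2x2 minimum as a stand-in for the critical point filter; `halve`, `build_hierarchy`, and the sample images are hypothetical. The matching computation of the third step is not shown.

```python
import numpy as np

def halve(img, op=np.min):
    """Halve the resolution by applying `op` over non-overlapping 2x2 blocks."""
    h, w = img.shape
    return op(img.reshape(h // 2, 2, w // 2, 2), axis=(1, 3))

def build_hierarchy(img, op=np.min):
    """Return the hierarchical images, ordered from finest to coarsest (1x1)."""
    img = np.asarray(img)
    h, w = img.shape
    if h != w or h < 1 or h & (h - 1):
        raise ValueError("this sketch assumes a square image with a power-of-two side")
    levels = [img]
    while levels[-1].shape[0] > 1:
        levels.append(halve(levels[-1], op))
    return levels

source = np.arange(64).reshape(8, 8)   # assumed sample source image
destination = source[::-1].copy()      # assumed sample destination image

source_levels = build_hierarchy(source)
destination_levels = build_hierarchy(destination)

# Levels of equal resolution are paired; a matching step would consume these pairs.
for src, dst in zip(source_levels, destination_levels):
    print(src.shape, dst.shape)
```

For an 8x8 input this yields four images per hierarchy (8x8, 4x4, 2x2, 1x1), so each hierarchy consists of two or more images as stated above.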
Still another aspect of the present invention relates to matching source and destination images. In this aspect, an evaluation equation is set beforehand for each of a plurality of matching evaluation items; these equations are combined into a combined evaluation equation; and an optimal matching is detected while paying attention to the neighborhood of an extreme of the combined evaluation equation. A combined evaluation equation may be defined as a linear combination or a sum of these evaluation equations, at least one of which has been multiplied by a coefficient parameter. In such a case, the parameter may be determined by detecting the neighborhood of an extreme of the combined evaluation equation or of any of the evaluation equations. The above description uses the term “the neighborhood of an extreme” because some error is tolerable, as it does not seriously affect the present invention.
Since an extreme itself depends on the parameter, it becomes possible to determine an optimal parameter based on the behavior of the extreme. Automatic determination of a parameter that would otherwise be difficult to tune is thereby achieved.
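As an illustration, a combined evaluation equation of this kind might be written as follows for the case of two evaluation items. The symbols f (a candidate matching), E_0 and E_1 (the individual evaluation equations), and \lambda (the coefficient parameter) are notation assumed here, and the extreme is taken to be a minimum purely for the sake of the example.

```latex
% Combined evaluation equation: a sum in which one term carries a coefficient parameter
E(f; \lambda) = \lambda \, E_0(f) + E_1(f)

% The extreme, and hence the matching that attains it, depends on the parameter
f^{*}(\lambda) = \arg\min_{f} E(f; \lambda)

% The parameter may then be chosen from the behavior of an evaluation value at that extreme,
% for example by observing how E_0(f^{*}(\lambda)) changes as \lambda is varied
```

Under these assumptions, varying \lambda traces out a family of matchings f^{*}(\lambda), and it is the behavior of the extreme along this family that the parameter determination relies on.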