1. Field of the Invention
The present invention relates to a stereo image processing apparatus, a stereo image processing method, and a recording medium on which a program for processing stereo images is recorded, and particularly relates to a method of automatically generating three-dimensional data from satellite stereo images or aerial stereo images.
2. Description of the Prior Art
Conventionally, among methods of automatically generating three-dimensional data of this type, methods that generate three-dimensional data representing geographical features [DEM (Digital Elevation Map) data] by stereo matching of images obtained from an artificial satellite or an aircraft are widely used. Methods in which operators correct points for which correspondence cannot be obtained have also been proposed.
Here, stereo matching processing takes two images obtained by photographing an object from different viewpoints, namely stereo images, determines the corresponding points at which the same point is imaged in each, and uses the parallax between those points to determine the depth to the target and its shape according to the principle of triangulation.
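As a hedged illustration of the triangulation principle referred to above, consider an idealized pair of parallel cameras. This geometry is an assumption made here for illustration and is not stated in the cited art. The symbols f (focal length), B (baseline between the viewpoints), x_L and x_R (x coordinates of the corresponding points), and d (parallax) are likewise introduced only for this sketch. Under these assumptions, the depth Z of a point is:

```latex
Z = \frac{f\,B}{d}, \qquad d = x_L - x_R
```

A larger parallax therefore corresponds to a point nearer the viewpoints. Satellite and aerial imaging geometries generally differ from this idealized case, so the relation indicates only the form of the dependence.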
Various methodologies have already been proposed for this stereo matching processing. For example, the specification of Japanese Patent Publication No. 8-16930 discloses a methodology using the area correlation method, which is in wide general use. In the area correlation method, a correlation window is set in the left image as a template, a search window is moved in the right image while its cross-correlation coefficient with the template is calculated, and, with this coefficient regarded as the degree of matching, a point with a high coefficient is searched for, thereby obtaining the corresponding point.
In the method described above, the range over which the search window is moved is limited to the direction of epipolar lines in the image in order to reduce the processing load, whereby the magnitude of displacement in the x direction of the corresponding point in the right image, namely the parallax, can be obtained for each point in the left image. Here, the epipolar line for a point in one image of a stereo pair is the line in the other image along which the point corresponding to that point can exist. The epipolar line is described in “Image Analysis Handbook” (edited by Mikio Takagi and Hirohisa Simoda, Tokyo Univ. Press, Jan. 1991, pp. 597–599).
The direction of epipolar lines usually differs from the direction of scan lines in the image, but the images can be rearranged by coordinate transformation so that the direction of epipolar lines matches the direction of scan lines. The method of coordinate transformation is described in the above-mentioned “Image Analysis Handbook”.
In stereo images rearranged as described above, the range over which the search window for the corresponding point is moved can be restricted to the scan line, and thus the parallax is obtained as the difference in x coordinates between corresponding points in the left and right images.
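The area correlation search restricted to a scan line, as described above, can be sketched as follows. This is a minimal illustration and not the implementation of the cited publication. The function names (`ncc`, `window`, `match_on_scan_line`), the window half-size `half`, and the search range `max_disp` are hypothetical, and the images are assumed to be already rearranged so that epipolar lines coincide with scan lines.

```python
import math

def ncc(a, b):
    """Normalized cross-correlation coefficient of two equal-length lists."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    den = math.sqrt(sum((x - mean_a) ** 2 for x in a)
                    * sum((y - mean_b) ** 2 for y in b))
    # A flat (textureless) window has zero variance and gives no evidence.
    return num / den if den > 0 else 0.0

def window(img, row, col, half):
    """Flatten the (2*half+1)-square window centered at (row, col)."""
    return [img[r][c]
            for r in range(row - half, row + half + 1)
            for c in range(col - half, col + half + 1)]

def match_on_scan_line(left, right, row, col, half=1, max_disp=4):
    """Return (parallax, score) for left[row][col], searching only along `row`."""
    template = window(left, row, col, half)  # correlation window in the left image
    width = len(right[0])
    best_d, best_score = 0, -1.0
    for d in range(-max_disp, max_disp + 1):
        c = col + d
        if c - half < 0 or c + half >= width:
            continue  # search window would leave the image
        score = ncc(template, window(right, row, c, half))
        if score > best_score:
            best_d, best_score = d, score
    return best_d, best_score

# Hypothetical test data: a small textured patch displaced by 2 pixels in x.
patch = [[1, 2, 3], [4, 9, 6], [7, 8, 5]]
left = [[0] * 12 for _ in range(5)]
right = [[0] * 12 for _ in range(5)]
for r in range(3):
    for c in range(3):
        left[1 + r][4 + c] = patch[r][c]
        right[1 + r][6 + c] = patch[r][c]

parallax, score = match_on_scan_line(left, right, row=2, col=5)
```

In this sketch the parallax is the signed x displacement of the best-scoring search window. Windows without texture score zero here, which reflects the general difficulty of matching textureless areas by correlation.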
FIG. 5 shows an example of two satellite images obtained by stereo-photographing the same point from different viewpoints. A satellite image usually covers a rather wide area, but to simplify the description, a magnified part of the image is shown, in which several buildings are seen around roads intersecting each other in the center of the image.
When the left and right images of two satellite images obtained by stereo-photographing an object are compared, the rooftop surfaces of buildings are imaged at positions displaced according to their respective heights, while the position of a road is substantially the same in both images. For example, a building a in the left image shown in FIG. 5 corresponds to a building b in the right image, and when this building b is drawn at the same position in the left image, it appears as a building b′. For the rooftop surfaces of these buildings a and b′, the magnitude of displacement c in the x coordinate is the parallax.
When the parallax obtained through the above-described processing is visualized as a pixel value, the image is dark for the ground surface of a road, which has no parallax, and bright for the rooftop of a building in proportion to the height of the building, as shown in FIG. 6A. FIG. 6B shows cross sections of those buildings along the dotted line in FIG. 6A, with the pixel values representing parallaxes (DEM data values representing heights) plotted on the vertical axis. Height information corresponding to a structure on the rooftop of the building can be obtained from FIG. 6B. If information on the imaging points and visual angles for these images is used, the height corresponding to one pixel of parallax can be known, and thus three-dimensional data representing geographical features around the photographed point is obtained from the above-described image.
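The visualization of parallax as pixel values and the conversion of parallax to height described above can be sketched as follows. This is a minimal illustration under stated assumptions: the function names and the scale factor `metres_per_pixel` (the height corresponding to one pixel of parallax) are hypothetical, and the value 2.5 is chosen only for the example. In practice that factor would be derived from the imaging points and visual angles.

```python
def parallax_to_gray(parallax_map):
    """Scale parallax linearly to 0..255 so that zero parallax (ground) is dark."""
    peak = max(max(row) for row in parallax_map)
    if peak <= 0:
        return [[0 for _ in row] for row in parallax_map]
    return [[round(255 * max(p, 0) / peak) for p in row] for row in parallax_map]

def parallax_to_height(parallax_map, metres_per_pixel):
    """Convert parallax in pixels to height, assuming a constant scale factor."""
    return [[p * metres_per_pixel for p in row] for row in parallax_map]

# Hypothetical parallax map: road (0), a low structure (1), a tall rooftop (5).
parallax_map = [[0, 0, 5],
                [0, 1, 5]]

gray = parallax_to_gray(parallax_map)
heights = parallax_to_height(parallax_map, metres_per_pixel=2.5)
```

A constant scale factor is a simplification. It treats the relation between parallax and height as linear over the whole image.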
In methods of generating three-dimensional data by conventional stereo matching, however, the images also include areas with no texture and areas in which correspondence cannot be obtained from coefficients of correlation, so the image of the above-described three-dimensional data contains a large number of points indicating erroneous heights that differ greatly from their surroundings. In particular, since hiding (occlusion) occurs around a building and the like, the number of points for which correspondence cannot be obtained may become large, which may result in cases where extremely high values appear or the shape of the building is significantly degraded.
Thus, in methods of generating three-dimensional data by conventional stereo matching, errors arise from mismatching of corresponding points and accurate three-dimensional information cannot be obtained, which has the disadvantage that application to complicated images containing a large number of buildings, such as those of urban areas, is difficult.
Also, because a satellite image or an aerial photograph usually contains an enormous amount of image data, there is a further disadvantage that correction operations by operators are difficult and complicated. In view of these problems, a method of generating and processing three-dimensional data is desired by which sufficiently accurate information can be obtained even from images of urban areas and by which processing can be performed automatically.