Image filing systems, in which documents are converted into images with image input apparatuses such as image scanners and stored electronically so that they can be retrieved afterwards, have come into practical use. Image retrieval techniques can be used to retrieve documents that have been scanned in the form of images.
Conventional image retrieval techniques include retrieval based on text given to images and retrieval based on visual content of images.
Retrieval based on text given to images is carried out as follows. Text information explaining the images is created as information to be associated with the images. The images are then retrieved by using the text information as keywords. Conventional techniques for this kind of retrieval are described in Publications 1 and 2, for example.
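The text-based retrieval described above can be sketched as a keyword index over manually entered annotations. This is only a minimal illustration and is not the method of Publication 1 or 2; the function names, image identifiers, and keywords are all hypothetical.

```python
from collections import defaultdict


def build_keyword_index(annotations):
    """Map each lower-cased keyword to the set of image ids annotated with it."""
    index = defaultdict(set)
    for image_id, keywords in annotations.items():
        for keyword in keywords:
            index[keyword.lower()].add(image_id)
    return index


def search(index, query_keywords):
    """Return the sorted ids of images annotated with every query keyword."""
    matches = [index.get(k.lower(), set()) for k in query_keywords]
    if not matches:
        return []
    return sorted(set.intersection(*matches))


# Hypothetical annotations, entered by hand for each scanned image.
annotations = {
    "scan_001": ["invoice", "2005"],
    "scan_002": ["invoice", "2006"],
    "scan_003": ["contract", "2006"],
}
index = build_keyword_index(annotations)

print(search(index, ["invoice", "2006"]))
print(search(index, ["bill"]))
```

In this sketch a query for "bill" finds nothing, because the hypothetical annotator wrote "invoice": retrieval succeeds only when the searcher and the annotator happen to choose the same words.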
It is, however, currently impossible to give text to images automatically by means of computer vision and artificial intelligence technologies, although these technologies continue to advance. Thus, for retrieval based on text, the text needs to be given manually, which requires bothersome work.
Moreover, since the text is given manually, it may reflect the subjective point of view of the person who gives it. Thus, the text given may differ in meaning. Further, there is no established rule as to how text is to be given to images. Thus, the interpretation of keywords may vary to some extent. Therefore, an image obtained as a result of the retrieval may not always be the image targeted by the user, which affects the accuracy of the retrieval.
Further, image retrieval based on text does not utilize visual features of images (e.g. colors, patterns). Thus, the text alone cannot be said to present sufficient information on the images.
On the other hand, retrieval based on the visual content of images is carried out as follows. An image is retrieved on the basis of features of the image. Since images are retrieved by use of images, no manual input of text is necessary. Thus, no bothersome work is necessary. Further, there is no chance of introducing a person's subjective point of view.
Three features of images are in general use: the color feature, the pattern feature, and the shape feature.
The color feature is an overall attribute of an image. Surface characteristics of the image are described by use of information on colors of the image. Conventional techniques thereof are described in Publications 3 and 4, for example.
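As an illustration of color as an overall attribute of an image, the following sketch computes a normalized color histogram and compares two images by histogram intersection. It is a common textbook formulation offered under assumptions, not the technique of Publication 3 or 4; the function names, the bin count, and the tiny test images are hypothetical.

```python
def color_histogram(pixels, bins_per_channel=4):
    """Normalized joint RGB histogram.

    `pixels` is a list of rows, each row a list of (r, g, b) tuples
    with channel values in the range 0-255.
    """
    step = 256 // bins_per_channel
    last = bins_per_channel - 1
    hist = [0.0] * (bins_per_channel ** 3)
    count = 0
    for row in pixels:
        for r, g, b in row:
            # Clamp so that bin counts that do not divide 256 stay in range.
            rb = min(r // step, last)
            gb = min(g // step, last)
            bb = min(b // step, last)
            hist[(rb * bins_per_channel + gb) * bins_per_channel + bb] += 1
            count += 1
    return [h / count for h in hist] if count else hist


def histogram_intersection(h1, h2):
    """Similarity between two normalized histograms (1.0 means identical)."""
    return sum(min(a, b) for a, b in zip(h1, h2))


RED, BLUE = (255, 0, 0), (0, 0, 255)
all_red = [[RED, RED], [RED, RED]]
all_blue = [[BLUE, BLUE], [BLUE, BLUE]]
half_half = [[RED, RED], [BLUE, BLUE]]

h_red = color_histogram(all_red)
h_blue = color_histogram(all_blue)
h_half = color_histogram(half_half)
```

The histogram summarizes the whole image at once, which is why it describes the image as a whole rather than any particular section of it.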
The pattern feature reflects features of local structures of the image, and describes surface characteristics of the image. Local statistical calculation needs to be carried out in sections containing plural pixel points. Conventional techniques thereof are described in Publications 5 and 6, for example.
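The local statistical calculation mentioned above can be sketched as a statistic computed over small sections of several pixels each. The example below uses the variance of gray levels in non-overlapping blocks; this particular statistic, the block size, and the function name are assumptions chosen for illustration, and Publications 5 and 6 use their own descriptors.

```python
def block_variances(gray, block=2):
    """Variance of gray levels inside each non-overlapping block x block section.

    `gray` is a list of rows of gray levels. Rows and columns that do not
    fill a complete block are ignored.
    """
    height, width = len(gray), len(gray[0])
    result = []
    for top in range(0, height - block + 1, block):
        out_row = []
        for left in range(0, width - block + 1, block):
            values = [
                gray[y][x]
                for y in range(top, top + block)
                for x in range(left, left + block)
            ]
            mean = sum(values) / len(values)
            out_row.append(sum((v - mean) ** 2 for v in values) / len(values))
        result.append(out_row)
    return result


# A flat image and a checkerboard with the same dimensions.
flat = [[100] * 4 for _ in range(4)]
checker = [[255 if (x + y) % 2 else 0 for x in range(4)] for y in range(4)]

flat_features = block_variances(flat)
checker_features = block_variances(checker)
```

The flat image yields zero variance in every block, while the checkerboard yields a large variance in every block, so the statistic separates the two surfaces even though it looks only at local sections.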
Regarding the shape feature, division of an image and recognition of a section are carried out on a particular section of the image, and then the shape feature is extracted. Conventional techniques thereof are described in Publications 7 and 8, for example.
Publication 1: Specification of Chinese Patent Application Publication No. 1851713 “Multi-image-text based image search and display method”
Publication 2: Specification of Chinese Patent Application Publication No. 1402853 “Image retrieval system and image retrieval method”
Publication 3: Specification of Chinese Patent Application Publication No. 1365067 “Image retrieval method based on color and image characteristic combination”
Publication 4: Specification of Chinese Patent Application Publication No. 1426002 “Image research method and device not affected by light change”
Publication 5: Specification of Chinese Patent Application Publication No. 1570972 “An image retrieval method based on image grain characteristic”
Publication 6: Specification of Chinese Patent Application Publication No. 1342300 “Texture description method and texture-based image retrieval method using Gabor filter in frequency domain”
Publication 7: Specification of Chinese Patent Application Publication No. 1570969 “An image retrieval method based on marked interest point”
Publication 8: Specification of Chinese Patent Application Publication No. 1570973 “An image retrieval method using marked edge”
However, the conventional retrievals based on visual content of images have the following problems.
Images are always affected by noise, which results in deformation and distortion of the images. Retrieval methods that use patterns and colors as the features of the image cannot obtain ideal retrieval results if such deformation or distortion occurs.
Further, colors are insensitive to changes in the direction or size of sections of the image. Thus, local features cannot be captured properly by use of the color feature.
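The insensitivity of color to direction and size can be made concrete with a small sketch: rotating or enlarging an image rearranges its pixels but leaves the proportion of each color unchanged. The helper names and the tiny test image below are hypothetical.

```python
from collections import Counter


def color_distribution(pixels):
    """Fraction of pixels having each color, ignoring where the pixels are."""
    counts = Counter(color for row in pixels for color in row)
    total = sum(counts.values())
    return {color: n / total for color, n in counts.items()}


def rotate_90(pixels):
    """Rotate the image 90 degrees clockwise."""
    return [list(row) for row in zip(*pixels[::-1])]


def scale_up(pixels, factor=2):
    """Enlarge the image by repeating each pixel `factor` times in both directions."""
    return [
        [color for color in row for _ in range(factor)]
        for row in pixels
        for _ in range(factor)
    ]


RED, BLUE = (255, 0, 0), (0, 0, 255)
original = [[RED, RED], [BLUE, BLUE]]   # red band above a blue band
rotated = rotate_90(original)           # blue band to the left of a red band
enlarged = scale_up(original)           # same layout at twice the size
```

All three images have the same color distribution (half red, half blue) although their layouts differ, so a descriptor based only on color cannot tell them apart.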
In particular, Publication 4 considers the impact of lighting on color information at the time the image is formed. In Publication 4, low luminance and its own luminance are eliminated, and the remaining pixels are converted into colors in a standard luminance space. However, the standard value determined for the low luminance, the way in which that value is determined, and the selection of the limit value of its own luminance directly affect the subsequent processes on the image, and therefore have a significant impact on the results of the retrieval.
Further, a retrieval method using pattern features simply cannot obtain high-level image content. Furthermore, if the resolution of the image changes, the pattern obtained by calculation from the image may deviate significantly. Moreover, if the image is affected by lighting or by reflection conditions, false patterns may be formed, which are misleading.
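The effect of a resolution change on a computed pattern can be seen in a small sketch. The statistic used here (mean absolute difference between horizontally adjacent pixels) and the averaging downsampler are assumptions chosen for illustration, not the calculation of any cited publication.

```python
def mean_adjacent_difference(gray):
    """Mean absolute gray-level difference between horizontally adjacent pixels."""
    diffs = [
        abs(row[x + 1] - row[x])
        for row in gray
        for x in range(len(row) - 1)
    ]
    return sum(diffs) / len(diffs)


def downsample_2x(gray):
    """Halve the resolution by averaging each non-overlapping 2 x 2 block."""
    return [
        [
            (gray[y][x] + gray[y][x + 1] + gray[y + 1][x] + gray[y + 1][x + 1]) / 4
            for x in range(0, len(gray[0]) - 1, 2)
        ]
        for y in range(0, len(gray) - 1, 2)
    ]


# A fine checkerboard: strongly patterned at full resolution.
fine = [[255 if (x + y) % 2 else 0 for x in range(4)] for y in range(4)]
coarse = downsample_2x(fine)

full_res_value = mean_adjacent_difference(fine)
half_res_value = mean_adjacent_difference(coarse)
```

At full resolution the statistic takes its maximum value, while at half resolution the checkerboard averages out to a uniform gray and the statistic drops to zero, so the same surface produces two very different pattern values.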
On the other hand, retrieval based on the shape feature of the image is not very effective for images having vague outlines. Normally, the information on a target shape that the extracted shape feature reflects is not exactly identical to human intuitive perception. Therefore, it is not easy to establish a perfect mathematical model. Thus, it is difficult to determine the features, and much calculation time and memory are required. Further, changes in the shape of the image cause a significant decrease in accuracy.
In particular, Publication 8 is applied to images having sharp edges. Thus, it is easy to capture the edges of an image document containing mainly text and graphics, which is the kind of document that the present invention focuses on. However, it is difficult to define the edges in such a way as to describe the full image.