1. Field of the Invention
The present invention relates to a method of and an apparatus for determining whether a query signal and a search signal are similar to each other or not in similar image or audio retrieval processes.
2. Description of the Related Art
Similar signal retrieval techniques for retrieving signal data similar to a given signal (query signal) from signal data stored in a signal database which contains image or audio signal data are used for similar image retrieval, for example.
Many different similar signal retrieval techniques have been proposed in the art. The proposed techniques mainly use either characteristics representing overall color information of an image or a collection of characteristics representing local color information of an image.
Most of the similar signal retrieval techniques which use overall color information as image characteristics employ a process of calculating a color histogram which is representative of the distribution of the colors of the pixels contained in an image, and retrieving a similar image on the basis of the similarity between color histograms. The color histogram refers to image characteristics comprising the colors of an image and their ratios. Color histogram information may be a histogram of all colors contained in an image, or a histogram of one or more representative colors of an image and their ratios. The above similar signal retrieval techniques, however, are disadvantageous in that they do not reflect a spatial arrangement of the colors of an image, i.e., they do not reflect a layout structure of the colors of an image.
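The limitation described above can be illustrated with a minimal sketch. The quantization into four levels per channel, the L1 distance, and the function names are assumptions made for illustration only; they are not taken from any of the cited techniques.

```python
from collections import Counter

def color_histogram(pixels, levels=4):
    """Map each quantized color to its ratio among the pixels.

    pixels is a flat list of (r, g, b) tuples with channels in 0..255.
    The number of quantization levels is an illustrative assumption.
    """
    step = 256 // levels
    counts = Counter((r // step, g // step, b // step) for r, g, b in pixels)
    total = len(pixels)
    return {color: n / total for color, n in counts.items()}

def histogram_distance(h1, h2):
    """L1 distance between two histograms (0 means identical ratios)."""
    colors = set(h1) | set(h2)
    return sum(abs(h1.get(c, 0.0) - h2.get(c, 0.0)) for c in colors)

RED, BLUE = (255, 0, 0), (0, 0, 255)
red_first = [RED, RED, BLUE, BLUE]   # four pixels in scan order
blue_first = [BLUE, BLUE, RED, RED]  # same colors, different arrangement
all_red = [RED] * 4

same_colors_distance = histogram_distance(
    color_histogram(red_first), color_histogram(blue_first))
different_colors_distance = histogram_distance(
    color_histogram(red_first), color_histogram(all_red))
```

In this sketch the two images with the same colors in different positions have a distance of zero, which is exactly the loss of layout information noted above, whereas the image with different color ratios has a nonzero distance.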
One image retrieval system, disclosed in U.S. Pat. No. 5,579,471, uses a collection of characteristics representing local color information of an image, thereby taking into account a spatial arrangement of the colors of the image. In one mode of operation of the disclosed image retrieval system, each of the images contained in a database is divided into blocks. The pixels of each block are grouped into subsets of similar colors, and the largest one of the subsets is selected. The average color of the selected subset is used as the representative color of the block.
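The block-based mode of operation can be sketched as follows. This is only an illustration under stated assumptions: the pixels of a block are grouped by coarse color quantization, and the block size, the quantization levels, and all function names are hypothetical rather than taken from the cited patent.

```python
from collections import defaultdict

def representative_color(block_pixels, levels=4):
    """Average color of the largest group of similar colors in one block.

    Similarity is approximated by coarse quantization (an assumption).
    """
    step = 256 // levels
    groups = defaultdict(list)
    for pixel in block_pixels:
        groups[tuple(c // step for c in pixel)].append(pixel)
    largest = max(groups.values(), key=len)
    n = len(largest)
    return tuple(sum(p[i] for p in largest) / n for i in range(3))

def block_representatives(image, block_size):
    """Divide a 2-D list of (r, g, b) pixels into blocks and return a
    grid holding the representative color of each block."""
    height, width = len(image), len(image[0])
    grid = []
    for top in range(0, height, block_size):
        row = []
        for left in range(0, width, block_size):
            pixels = [image[y][x]
                      for y in range(top, min(top + block_size, height))
                      for x in range(left, min(left + block_size, width))]
            row.append(representative_color(pixels))
        grid.append(row)
    return grid

# Left block: three reddish pixels and one blue pixel; right block: green.
image = [
    [(250, 0, 0), (240, 0, 0), (0, 200, 0), (0, 200, 0)],
    [(230, 0, 0), (0, 0, 255), (0, 200, 0), (0, 200, 0)],
]
representatives = block_representatives(image, block_size=2)
```

Because one representative color is kept per block, the resulting grid preserves where colors occur in the image, which a single overall histogram does not.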
Japanese laid-open patent publication No. 2000-259832 discloses an image retrieval apparatus which uses image characteristics representing a spatial arrangement of colors which is expressed on a frequency axis. The image retrieval apparatus employs image frequency conversion coefficients having values indicative of energies in respective frequency bands, the values being obtained by dividing the frequency distribution of the spectrum of the color (average color) of an image into the frequency bands and analyzing the frequencies in the frequency bands using orthogonal matrixes.
FIG. 1 of the accompanying drawings shows in block form an arrangement of the disclosed image retrieval apparatus. As shown in FIG. 1, the image retrieval apparatus has image characteristic generator 102 for generating characteristic 103 from image data 101, image characteristic memory 104 storing characteristics in advance therein, and similarity calculator 106 for calculating similarity 107 between characteristic 103 and characteristic 105 stored in image characteristic memory 104. Image characteristic generator 102 comprises image size converting means 110 for generating image 111 of fixed size from image data 101, frequency analyzing means 112 for analyzing frequencies of image 111, and DC component/partial AC component extracting means 114 for extracting DC components and partial AC components of frequency conversion coefficients 113 produced by frequency analyzing means 112 as characteristic 103.
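The three stages of image characteristic generator 102 can be sketched as below. This is a minimal illustration, not the disclosed apparatus: it assumes a single grayscale channel, a fixed size of 8 by 8 obtained by block averaging, an orthonormal DCT-II as the frequency analysis, and a low-frequency-first (zigzag-style) selection of six coefficients. All names and parameter values are hypothetical.

```python
import math

N = 8  # assumed fixed image size after conversion

def resize_by_averaging(gray, size=N):
    """Image size conversion: shrink a 2-D grayscale image to size x size.

    Assumes the height and width are multiples of size.
    """
    bh, bw = len(gray) // size, len(gray[0]) // size
    return [[sum(gray[i * bh + dy][j * bw + dx]
                 for dy in range(bh) for dx in range(bw)) / (bh * bw)
             for j in range(size)] for i in range(size)]

def dct_2d(block):
    """Frequency analysis: orthonormal 2-D DCT-II of an n x n block."""
    n = len(block)

    def alpha(k):
        return math.sqrt((1 if k == 0 else 2) / n)

    cos = [[math.cos((2 * x + 1) * k * math.pi / (2 * n)) for x in range(n)]
           for k in range(n)]
    return [[alpha(u) * alpha(v) * sum(block[y][x] * cos[u][y] * cos[v][x]
                                       for y in range(n) for x in range(n))
             for v in range(n)] for u in range(n)]

def low_frequency_order(n):
    """Coefficient positions ordered from low to high frequency (zigzag)."""
    return sorted(((u, v) for u in range(n) for v in range(n)),
                  key=lambda uv: (uv[0] + uv[1],
                                  uv[0] if (uv[0] + uv[1]) % 2 else -uv[0]))

def layout_characteristic(gray, num_coefficients=6):
    """Extract the DC component and the first few AC components."""
    coeffs = dct_2d(resize_by_averaging(gray))
    return [coeffs[u][v] for u, v in low_frequency_order(N)[:num_coefficients]]

flat = [[100] * 16 for _ in range(16)]                # uniform image
gradient = [[0] * 8 + [200] * 8 for _ in range(16)]   # dark left, bright right
flat_feature = layout_characteristic(flat)
gradient_feature = layout_characteristic(gradient)
```

In this sketch the two test images have the same average brightness and therefore the same DC component, while the AC components differ because the brightness is arranged differently.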
To increase the accuracy of retrieval in the image retrieval apparatus, it is necessary to increase the accuracy with which a similarity between images is determined. One way of doing so is to use both a characteristic representing a color layout and a characteristic representing a color histogram. If a DCT (Discrete Cosine Transform) coefficient is used as the color layout characteristic and a color histogram is used as the color histogram characteristic, then a similarity between images is determined as follows: First, the distance between the DCT coefficient of a query image and the DCT coefficient of a search image, and the distance between the color histogram of the query image and the color histogram of the search image, are calculated. Then, the calculated distances are added together, and a similarity between the query image and the search image is determined on the basis of the sum of the distances.
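The two-step determination described above can be sketched as follows. The choice of a Euclidean distance for the DCT coefficients and an L1 distance for the histograms, the dictionary layout, and the numeric values are all assumptions made for illustration; the text only states that the two distances are added.

```python
import math

def layout_distance(c1, c2):
    """Euclidean distance between two DCT coefficient vectors (assumed metric)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(c1, c2)))

def histogram_distance(h1, h2):
    """L1 distance between two histograms with the same bin order (assumed metric)."""
    return sum(abs(a - b) for a, b in zip(h1, h2))

def combined_distance(query, search):
    """Sum of the two distances; a smaller sum means a higher similarity."""
    return (layout_distance(query["dct"], search["dct"])
            + histogram_distance(query["hist"], search["hist"]))

def rank(query, database):
    """Names of the search images ordered from most to least similar."""
    return sorted(database, key=lambda name: combined_distance(query, database[name]))

# Hypothetical characteristics: [DC, AC1, AC2] and a three-bin histogram.
query = {"dct": [800.0, 0.0, 0.0], "hist": [0.5, 0.5, 0.0]}
database = {
    "a": {"dct": [800.0, 3.0, 4.0], "hist": [0.5, 0.5, 0.0]},  # layout differs
    "b": {"dct": [800.0, 0.0, 0.0], "hist": [0.0, 0.5, 0.5]},  # colors differ
}
ranking = rank(query, database)
```

Note that the plain sum applies no weighting or normalization, so the scale of each distance directly affects which characteristic dominates the result.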
Since the DC component of the DCT coefficient represents color information as the average color of an image in the above example, the DCT coefficient indicates not only the general color layout of the image, but also the representative color of the image. Therefore, when the distance between the DCT coefficients, which are also indicative of representative colors, and the distance between the color histograms are added together, the determined similarity largely reflects a similarity between the representative colors. Because the color layout characteristic and the color histogram characteristic are not necessarily characteristics of entirely different natures, if the distance between color layout characteristics and the distance between color histogram characteristics are added to each other and a similarity is determined based on the sum of the distances, the determined similarity includes an emphasized element that represents a property shared by both types of characteristics, decreasing the accuracy of retrieval.
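The statement that the DC component represents the average color can be checked with a short derivation. It assumes an orthonormal two-dimensional DCT-II of an N by N image f(x, y); other normalization conventions change the result only by a constant factor. The symbols C, α, and f̄ are introduced here for illustration.

```latex
\[
C(u,v) = \alpha(u)\,\alpha(v) \sum_{x=0}^{N-1}\sum_{y=0}^{N-1} f(x,y)\,
  \cos\frac{(2x+1)u\pi}{2N}\,\cos\frac{(2y+1)v\pi}{2N},
\qquad \alpha(0)=\sqrt{\tfrac{1}{N}},\quad \alpha(k)=\sqrt{\tfrac{2}{N}}\ (k\ge 1)
\]
\[
C(0,0) = \frac{1}{N}\sum_{x=0}^{N-1}\sum_{y=0}^{N-1} f(x,y) = N\,\bar{f},
\qquad \bar{f} = \frac{1}{N^{2}}\sum_{x=0}^{N-1}\sum_{y=0}^{N-1} f(x,y)
\]
```

Under this assumption the DC component is the average color scaled by N, so a difference in average color between two images contributes to the DCT coefficient distance as well as to the color histogram distance.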
Characteristics of audio signals are also subject to similar limitations. An audio signal interval similar to an audio signal interval having a certain length is retrieved by using both a frequency distribution characteristic extracted from the entire audio signal interval and a collection of frequency distribution characteristics extracted from respective divided segments of the audio signal interval. Since those types of frequency distribution characteristics are not necessarily characteristics of entirely different natures, if the distance between frequency distribution characteristics of one type and the distance between frequency distribution characteristics of the other type are added to each other, and a similar audio signal interval is retrieved on the basis of the sum of the distances, the determined similarity includes an emphasized element that represents a property shared by both types of characteristics.