1. Field of the Invention
This invention relates to a method, an apparatus and a program for detecting similar time series, which detect the similarity between two time-domain signals at high speed, and to a recording medium having such a program recorded thereon.
This application claims priority of Japanese Patent Application No. 2002-200480, filed on Jul. 9, 2002, the entirety of which is incorporated by reference herein.
2. Description of Related Art
For purposes such as retrieving video or audio contents, identifying broadcast commercial messages, identifying the titles of broadcast musical numbers, or supervising network contents, there has long been a need for a technique that detects, in unknown video or audio signals, the portion that is substantially coincident with certain video or audio signals used as reference signals (a query).
This task is treated as a problem in which, as shown in FIG. 18A, feature values are extracted from the video or audio signals in every short time frame and vectorized to construct feature vectors, and the time-domain patterns of these feature vectors are compared.
Among the methods for comparison, the so-called full search method is considered the simplest and most universal, and is also superior in reliability. In this method, a partial domain of the input time-domain signals, matched in length to the reference time-domain signals and indicated by a rectangle drawn with a thick line in the drawing, is compared with the reference time-domain signals to find the degree of similarity, for example by the correlation method or by distance calculations, as shown in FIG. 18B. This processing is executed sequentially as the domain being compared is shifted by one frame at a time, as shown in FIGS. 18C and 18D, and the signal portions where the degree of similarity exceeds a preset threshold value are detected as the coincident signal portions.
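The full search procedure described above can be sketched as follows. This is only an illustrative sketch, not an implementation taken from any cited document. It assumes that each frame is a list of floating-point feature values, and that the degree of similarity is computed as 1 / (1 + mean squared difference), which is one arbitrary choice of distance-based measure. The names `window_similarity`, `full_search` and `threshold` are hypothetical.

```python
from typing import List, Sequence

Frame = Sequence[float]  # one feature vector per short time frame (assumption)


def window_similarity(reference: Sequence[Frame], window: Sequence[Frame]) -> float:
    """Similarity in (0, 1]; 1.0 means the two windows are identical.

    Assumed measure: 1 / (1 + mean squared difference over all components).
    """
    total = 0.0
    count = 0
    for ref_frame, win_frame in zip(reference, window):
        for a, b in zip(ref_frame, win_frame):
            total += (a - b) ** 2
            count += 1
    return 1.0 / (1.0 + total / count)


def full_search(reference: Sequence[Frame], signal: Sequence[Frame],
                threshold: float) -> List[int]:
    """Return every start frame whose window similarity exceeds the threshold."""
    if not reference or not reference[0]:
        raise ValueError("reference must contain at least one non-empty frame")
    length = len(reference)
    hits = []
    # Shift the compared domain by one frame at a time over the whole input.
    for start in range(len(signal) - length + 1):
        window = signal[start:start + length]
        if window_similarity(reference, window) > threshold:
            hits.append(start)
    return hits


reference = [[1.0, 0.0], [0.0, 1.0]]
signal = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
print(full_search(reference, signal, threshold=0.9))
```

In this toy input the reference occurs exactly once, starting at frame 1, and the inner loops visit every component of every window, which is the source of the processing cost of the full search method.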
However, the full search method suffers from the problem that, for every frame of the input time-domain signals, the degree of similarity must be calculated over a number of components equal to the product of the dimension of the feature vectors and the length of the reference time-domain signals, resulting in a considerable processing volume and retrieval time.
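The size of this processing volume can be made concrete with a short calculation. The function name and all numerical values below are hypothetical and chosen only for illustration: one hour of input at 100 frames per second, a 15-second reference, and 8-dimensional feature vectors.

```python
def full_search_component_comparisons(num_input_frames: int,
                                      reference_length: int,
                                      vector_dimension: int) -> int:
    """Number of component-wise comparisons made by a one-frame-shift full search."""
    # Number of window positions at which the reference fits inside the input.
    positions = max(num_input_frames - reference_length + 1, 0)
    # Each position compares (reference length) x (vector dimension) components.
    return positions * reference_length * vector_dimension


# Illustrative figures only (assumptions, not values from the source).
input_frames = 60 * 60 * 100   # one hour at 100 frames per second
reference_frames = 15 * 100    # a 15-second reference
dimension = 8
print(full_search_component_comparisons(input_frames, reference_frames, dimension))
```

Under these assumed figures the count is roughly 4.3 billion component comparisons for a single reference signal.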
To address this, Japanese Patent 3065314 discloses a technique in which a histogram of the feature values is prepared for the reference time-domain signals and for the corresponding location in the input time-domain signals (indicated by a rectangle in thick line), and the next position for comparison, that is, the amount of movement of the thick-line rectangle in the drawing, is determined based on the difference between the two histograms, thereby increasing the speed of the comparison processing. Specifically, this technique deliberately degrades the time-axis information by forming histograms, which eliminates the necessity for frame-by-frame comparison and speeds up the processing.
However, the technique disclosed in Japanese Patent 3065314 suffers from the problem that, since the comparison is carried out on a reduced volume of information, deteriorated signals may fail to be detected or different signals may be mistakenly detected, thus lowering the detection capability.
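The idea of deriving the amount of movement from a histogram difference can be illustrated with the following sketch. It is not the method of the cited patent, whose exact rule is not reproduced here. It assumes that each frame has already been quantized to an integer code, that histograms are compared by histogram intersection, and that a window is detected when at least `min_matches` frames intersect. Since shifting the window by one frame can raise the intersection count by at most one, a shortfall of k matches allows a skip of k frames. All names are hypothetical.

```python
from collections import Counter
from typing import List, Sequence


def histogram_intersection(a: Counter, b: Counter) -> int:
    """Number of frames whose codes are shared by both histograms."""
    return sum(min(count, b[symbol]) for symbol, count in a.items())


def histogram_skip_search(reference_codes: Sequence[int],
                          signal_codes: Sequence[int],
                          min_matches: int) -> List[int]:
    """Return start frames whose window histogram intersects the reference
    histogram in at least min_matches frames, skipping hopeless positions."""
    length = len(reference_codes)
    ref_hist = Counter(reference_codes)
    hits = []
    start = 0
    while start + length <= len(signal_codes):
        win_hist = Counter(signal_codes[start:start + length])
        matches = histogram_intersection(ref_hist, win_hist)
        if matches >= min_matches:
            hits.append(start)
            start += 1
        else:
            # One shift adds at most one match, so the next (shortfall - 1)
            # positions cannot reach min_matches and are skipped.
            start += min_matches - matches
    return hits


print(histogram_skip_search([1, 2, 3], [0, 0, 0, 0, 1, 2, 3, 0, 0], min_matches=3))
print(histogram_skip_search([1, 2, 3], [3, 2, 1], min_matches=3))
```

The first call finds the reference at frame 4 while evaluating fewer window positions than a one-frame-shift search. The second call also reports a hit at frame 0 even though the frame order is reversed, because a histogram discards the order of frames within the window.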
In similarity retrieval of video and audio signals, two time series may be substantially coincident with each other and yet differ locally: different speech may be superimposed on the same music, different telops may be displayed on the same image, two similar images may be edited in different manners, or one of the time series may be partially deteriorated, such as by frame dropout. In such cases, feature vectors, parts of which are completely dissimilar, are inserted in basically similar video or audio signals, as shown by the shaded part in FIG. 19.
For the above-described purposes, that is, in retrieving video or audio contents, it would be convenient if two time series on which other signal components are partially superposed could be retrieved as being the same time series. However, the above-described conventional technique suffers from the problem that, if a marked difference is present, even partially, in otherwise substantially coincident signal portions, this difference cannot be distinguished from a global difference, and hence the coincident signal portions cannot be detected.