1. Field of the Invention
The present invention relates to a sound retrieving technology for retrieving sound data desired by a user on the basis of sound information and subjective impressions of the sound data. More particularly, the present invention relates to a sound feature extracting apparatus, a sound data registering apparatus, a sound data retrieving apparatus, a method for extracting sound features, a method for registering sound data, a method for retrieving sound data, and relevant programs for implementing those methods by using a computer.
2. Discussion of the Related Art
Hard disk drives and CD players with changers are types of sound databases for storing large amounts of sound data. To retrieve desired sound data or a desired music piece from such a sound database, a keyword such as the title, singer, or writer/composer of the music piece is commonly used.
A conventional sound data retrieving apparatus (referred to as an SD retrieving apparatus hereinafter and throughout drawings) will now be explained referring to FIG. 1. FIG. 1 is a block diagram of a system arrangement of the SD retrieving apparatus. A selection query inputting part 11 (referred to as an SLQ input part hereinafter and throughout drawings) is provided for entering a requirement, e.g. a title, for selecting the sound data to be retrieved. A sound database 12 contains sound information such as titles, singers, and writers/composers and can be accessed at any time. A sound information retriever 13 (referred to as an SI retriever hereinafter and throughout drawings) is provided for accessing the sound database 12 with a retrieving key, such as a title entered from the SLQ input part 11, to retrieve sound data equal or similar to the key data. A play sound selecting part 14 (referred to as a PS selector hereinafter and throughout drawings) is provided for finally selecting the sound data desired by the user from the outcome obtained by the SI retriever 13. A sound output part 15 is provided for reading out from the sound database 12 and reproducing a sound signal of the sound data selected by the PS selector 14.
The action of the sound data retrieving system is explained in conjunction with an example. It is assumed that a user desires to retrieve and listen to sound data A. The user enters “A” in the title section of the SLQ input part 11 to command the retrieval of sound data which include “A” in their titles. In response, the SI retriever 13 accesses the sound database 12, retrieves the sound data including “A” in their titles, and outputs the result. It is now assumed that the retrieved sound data have three different titles, “A1”, “A2”, and “A3”. Using the three titles, the user directs the PS selector 14 to examine their relevant sound information, such as singers and writers/composers, and selects one of the sound data. The selected sound data is then reproduced by the sound output part 15.
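The retrieval flow of this example may be illustrated by the following minimal sketch. It is not the apparatus of FIG. 1; the names `SoundRecord`, `retrieve_by_title`, and the sample singers and composers are hypothetical and chosen only for illustration, and a case-insensitive substring match is assumed as the meaning of "include “A” in their titles".

```python
from dataclasses import dataclass


@dataclass
class SoundRecord:
    """Hypothetical record of the objective sound information for one sound data."""
    title: str
    singer: str
    composer: str


def retrieve_by_title(database, key):
    """Stand-in for the SI retriever 13: keep records whose title contains the key."""
    key = key.lower()
    return [record for record in database if key in record.title.lower()]


# Stand-in for the sound database 12 (contents are invented for the example).
database = [
    SoundRecord("A1", "Singer X", "Composer P"),
    SoundRecord("A2", "Singer Y", "Composer Q"),
    SoundRecord("A3", "Singer Z", "Composer P"),
    SoundRecord("B1", "Singer X", "Composer R"),
]

# The user enters "A" as the title key and receives three candidates.
candidates = retrieve_by_title(database, "A")

# Stand-in for the PS selector 14: the user narrows the candidates by singer.
selected = next(record for record in candidates if record.singer == "Singer Y")
print([record.title for record in candidates], selected.title)
```

Every field used for matching here is objective sound information, so the sketch has no means of answering a query phrased as a subjective impression.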
However, the sound information including titles, singers, and writers/composers is objective, external data. It is hence difficult to infer from the sound information the subjective impression that the sound data makes on the user. For example, the selection of sound data based on a subjective expression such as “lively sound data” can hardly be realized with any conventional SD retrieving apparatus.
Such a psychological impression of the audible sound of sound data may be quantized as numerical data, namely a sound impression value. To implement the retrieval of sound data from its sound impression value, the subjective impression of each sound data may be indexed (quantized) and registered in the sound database 12, which can then be searched. However, the indexing and registering of the subjective impression of sound data largely depends on the user or operator of the system. Accordingly, when the amount of sound data to be registered is huge, its handling becomes a troublesome task.
The sound data retrieving technique of the present invention is capable of extracting the physical features from the sound signal of each sound data and retrieving the sound data desired by users using the subjective sound impression value determined over the sound data.
Meanwhile, a sound features extractor (referred to as an SF extractor hereinafter and throughout drawings) in the sound data retrieving system may be implemented by a tempo extractor. Tempo represents the speed of sound data and is the inverse of the beat cycle. Tempo is generally expressed by the number of quarter notes per minute. One conventional tempo extractor is disclosed in Japanese Patent Laid-open Publication (Heisei) 5-27751, “Tempo extraction device used for automatic music transcription device or the like”.
The conventional tempo extractor is shown in FIG. 2. The conventional tempo extractor comprises a signal receiver 21, a measure time length calculator 27, and a tempo calculator 26. The measure time length calculator 27 includes a power calculator 22, a differentiator 23 (referred to as a Diff), an auto-correlation calculator 24 (referred to as an ACR Calc throughout the drawing), and a peak detector 25. The measure time length calculator 27, denoted by the broken line, is provided for calculating the measure time length as a reference length.
The signal receiver 21 is provided for sampling sound signals. The power calculator 22 calculates power of a sound signal received in each processing frame. The differentiator 23 differentiates the power of each processing frame determined by the power calculator 22. The auto-correlation calculator 24 calculates an auto-correlation function of the differentiated power determined by the differentiator 23. The peak detector 25 detects the peak of the auto-correlation function to determine the periodic property of the sound signal and thus the time length of a measure as the reference length. The tempo calculator 26 hence calculates the tempo of the sound data from the measure time length and the number of beats entered separately.
More specifically, a sound signal received by the measure time length calculator 27 is processed by the power calculator 22 and the differentiator 23 to determine a power variation. The periodic property of the power variation is calculated by the auto-correlation calculator 24. The cycle peak where the periodic property is most strongly exhibited is determined by the peak detector 25 on the basis of a reference time length over which a human being naturally perceives one beat. The time cycle of that peak is assigned as the reference measure time length and is divided by the number of beats to obtain the beat cycle, from which the number of quarter notes per minute, i.e. the tempo, is determined.
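The processing chain of FIG. 2 may be sketched as follows. This is a simplified illustration under stated assumptions, not the implementation of the cited publication: frames are non-overlapping, differentiation is approximated by a first-order difference, the auto-correlation is unnormalized, and the reference time length for peak detection is replaced by a plain search window (`min_lag`, `max_lag`). All function names, parameters, and the synthetic test signal are hypothetical.

```python
def frame_power(samples, frame_len):
    """Power calculator 22: mean squared amplitude of each non-overlapping frame."""
    n_frames = len(samples) // frame_len
    return [
        sum(x * x for x in samples[i * frame_len:(i + 1) * frame_len]) / frame_len
        for i in range(n_frames)
    ]


def difference(values):
    """Differentiator 23, approximated by a first-order difference."""
    return [b - a for a, b in zip(values, values[1:])]


def autocorrelation(values, max_lag):
    """Auto-correlation calculator 24: unnormalized values for lags 0..max_lag."""
    n = len(values)
    return [
        sum(values[i] * values[i + lag] for i in range(n - lag))
        for lag in range(max_lag + 1)
    ]


def peak_lag(acf, min_lag):
    """Peak detector 25: lag at or above min_lag with the largest value."""
    return max(range(min_lag, len(acf)), key=lambda lag: acf[lag])


def estimate_tempo(samples, sample_rate, frame_len, beats_per_measure,
                   min_lag, max_lag):
    """Tempo calculator 26: quarter notes per minute from the measure length."""
    variation = difference(frame_power(samples, frame_len))
    lag = peak_lag(autocorrelation(variation, max_lag), min_lag)
    measure_seconds = lag * frame_len / sample_rate
    # beats_per_measure is the separately entered number of beats.
    return 60.0 * beats_per_measure / measure_seconds


def accented_clicks(n_measures, beats_per_measure, beat_frames, frame_len):
    """Hypothetical test signal: a one-frame burst on every beat, louder on beat 1."""
    total_frames = n_measures * beats_per_measure * beat_frames
    samples = [0.0] * (total_frames * frame_len)
    for beat in range(n_measures * beats_per_measure):
        amplitude = 1.0 if beat % beats_per_measure == 0 else 0.5
        start = (5 + beat * beat_frames) * frame_len
        for i in range(start, start + frame_len):
            samples[i] = amplitude
    return samples


# 4 beats per measure, 0.5 s per beat at 1000 samples/s: 120 quarter notes per minute.
signal = accented_clicks(n_measures=8, beats_per_measure=4,
                         beat_frames=50, frame_len=10)
tempo = estimate_tempo(signal, sample_rate=1000, frame_len=10,
                       beats_per_measure=4, min_lag=20, max_lag=300)
print(tempo)
```

For this synthetic input, the accent on the first beat makes the auto-correlation largest at the measure length, so the sketch recovers 120 quarter notes per minute. The result still depends on `beats_per_measure` being supplied from outside, as in the conventional extractor.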
However, the peak of the auto-correlation function of the power variation may not always appear at the measure time length or time cycle. For example, when the accent of a snare drum is emphasized in the half note cycle, as in popular, rhythm-instrument-oriented music, the peak of the auto-correlation function of the power variation appears at intervals equal to the time length of the half note cycle. If that peak is treated as the measure time length, the tempo may be calculated as twice the actual tempo. It is also necessary for the conventional system to input the number of beats or other data from a keyboard in advance. Accordingly, a priori knowledge about the music to be handled is necessary for determining the tempo.
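The doubling described above can be checked with simple arithmetic. The following sketch assumes a 4/4 measure lasting 2 seconds, i.e. an actual tempo of 120 quarter notes per minute; the function name `tempo_from_measure` and the numerical values are hypothetical.

```python
def tempo_from_measure(measure_seconds, beats_per_measure):
    """Quarter notes per minute, given a measure length and the entered beat count."""
    return 60.0 * beats_per_measure / measure_seconds


true_measure_seconds = 2.0                       # assumed 4/4 measure at 120 per minute
half_note_seconds = true_measure_seconds / 2.0   # period of the snare accent

# Peak found at the true measure length: the tempo is correct.
correct_tempo = tempo_from_measure(true_measure_seconds, beats_per_measure=4)

# Peak found at the half note cycle but treated as a full measure: the tempo doubles.
mistaken_tempo = tempo_from_measure(half_note_seconds, beats_per_measure=4)

print(correct_tempo, mistaken_tempo)  # 120.0 240.0
```

The beat count is the same in both calls; only the detected period differs, so the error comes entirely from the peak position.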
The sound features extracting technique of the present invention is capable of extracting the features of sound data without depending on the type of the sound data entered and without preparing a priori data about the sound data.