1. Field of the Invention
This invention generally relates to an apparatus, a method, and a computer program for classifying music pieces represented by audio signals. This invention particularly relates to an apparatus, a method, and a computer program for classifying music pieces according to category such as genre through analyses of audio data representing the music pieces.
2. Description of the Related Art
Japanese patent application publication number 2002-278547 discloses a system composed of a music-piece registering section, a music-piece database, and a music-piece retrieving section. The music-piece registering section registers audio signals representing respective music pieces and ancillary information pieces relating to the respective music pieces in the music-piece database. Each audio signal representing a music piece and the ancillary information piece relating thereto are stored in combination in the music-piece database. Each ancillary information piece includes an ID, a bibliographic information piece, acoustic feature values (acoustic feature quantities), and impression values about a corresponding music piece. The bibliographic information piece represents the title of the music piece and the name of a singer or a singer group performing the vocals in the music piece.
The music-piece registering section in the system of Japanese application 2002-278547 analyzes each audio signal to detect the values (the quantities) of acoustic features of the audio signal. The detected acoustic feature values are registered in the music-piece database. The music-piece registering section converts the detected acoustic feature values into values of a subjective impression about a music piece represented by the audio signal. The impression values are registered in the music-piece database. Examples of the acoustic feature values are the degree of variation in the spectrum between frames of the audio signal, the frequency of generation of a sound represented by the audio signal, the degree of non-periodicity of generation of a sound represented by the audio signal, and the tempo represented by the audio signal. Another example is as follows. The audio signal is divided into components in a plurality of different frequency bands. Rising signal components in the respective frequency bands are detected. The acoustic feature values are calculated from the detected rising signal components.
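By way of illustration only, the first example feature above (the degree of variation in the spectrum between frames) might be sketched as follows. The cited publication does not give a formula in this description, so the sketch assumes non-overlapping frames and takes the mean Euclidean distance between the magnitude spectra of consecutive frames. The function names and the test signals are hypothetical.

```python
import cmath
import math


def magnitude_spectrum(frame):
    """Magnitude of a plain DFT of one frame (bins 0 .. len/2 - 1)."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * i / n)
                for i, x in enumerate(frame)))
        for k in range(n // 2)
    ]


def spectral_variation(samples, frame_len):
    """Mean Euclidean distance between spectra of consecutive frames.

    Assumption: this particular definition is chosen for illustration;
    it is not taken from the cited publication.
    """
    frames = [samples[s:s + frame_len]
              for s in range(0, len(samples) - frame_len + 1, frame_len)]
    spectra = [magnitude_spectrum(f) for f in frames]
    if len(spectra) < 2:
        return 0.0
    diffs = [math.dist(a, b) for a, b in zip(spectra, spectra[1:])]
    return sum(diffs) / len(diffs)


FRAME_LEN = 64
# A steady tone: the same frequency bin is excited in every frame.
steady = [math.sin(2 * math.pi * 4 * i / FRAME_LEN) for i in range(256)]
# A tone that jumps to a different frequency halfway through.
changing = [math.sin(2 * math.pi * (4 if i < 128 else 12) * i / FRAME_LEN)
            for i in range(256)]

steady_variation = spectral_variation(steady, FRAME_LEN)
changing_variation = spectral_variation(changing, FRAME_LEN)
```

Under these assumptions the steady tone yields a variation close to zero, while the tone that changes frequency yields a clearly larger value.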
The music-piece retrieving section in the system of Japanese application 2002-278547 responds to a user's request for retrieving a desired music piece. The music-piece retrieving section computes impression values of the desired music piece from subjective-impression-related portions of the user's request. Bibliographic-information-related portions are extracted from the user's request. The computed impression values and the extracted bibliographic-information-related portions of the user's request are combined to form a retrieval key. The music-piece retrieving section searches the music-piece database in response to the retrieval key for ancillary information pieces similar to the retrieval key. Music pieces corresponding to the found ancillary information pieces (the search-result ancillary information pieces) are candidate music pieces. The music-piece retrieving section selects one of the candidate music pieces according to the user's selection or a predetermined selection rule. The search for ancillary information pieces similar to the retrieval key has the following steps. Matching is implemented between the extracted bibliographic-information-related portions of the user's request and the bibliographic information pieces in the music-piece database. Similarities between the computed impression values and the impression values in the music-piece database are calculated. For example, the Euclidean distances therebetween are calculated as similarities. Ancillary information pieces are then selected from the music-piece database on the basis of the matching result and the calculated similarities.
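The search steps described above might be sketched as follows, purely as an illustration. The record layout, the field names, and the sample values are hypothetical. The description does not say how the matching result and the similarities are combined, so the sketch assumes that bibliographic matching acts as an exact-match filter and that the remaining records are ranked by Euclidean distance, a smaller distance meaning a greater similarity.

```python
import math

# Hypothetical ancillary information pieces; the impression values are
# arbitrary three-dimensional vectors chosen for illustration.
DATABASE = [
    {"id": 1, "title": "Song A", "singer": "Alice", "impression": (0.9, 0.1, 0.4)},
    {"id": 2, "title": "Song B", "singer": "Alice", "impression": (0.2, 0.8, 0.5)},
    {"id": 3, "title": "Song C", "singer": "Bob", "impression": (0.9, 0.1, 0.4)},
]


def search(database, singer, impression, max_results=2):
    """Return IDs of candidate music pieces for a retrieval key."""
    # Step 1: matching on the bibliographic portion of the retrieval key
    # (assumed here to be an exact match on the singer name).
    matched = [rec for rec in database if rec["singer"] == singer]
    # Step 2: Euclidean distance between the requested impression values
    # and the registered impression values.
    scored = [(math.dist(rec["impression"], impression), rec["id"])
              for rec in matched]
    # Step 3: keep the closest records as candidates.
    scored.sort()
    return [rec_id for _, rec_id in scored[:max_results]]


candidates = search(DATABASE, "Alice", (0.8, 0.2, 0.4))
```

In this sketch, the record with ID 1 is ranked ahead of the record with ID 2 because its impression values lie closer to the requested ones, and the record with ID 3 is excluded by the bibliographic matching.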
Japanese patent application publication number 2005-316943 discloses the selection of at least one music piece from among a plurality of music pieces. According to Japanese application 2005-316943, a first storage device stores data representing music pieces, and a second storage device stores data representing the actual mean values and unbiased variances of feature parameters of the music pieces. Examples of the feature parameters for each of the music pieces are the number of chords used by the music piece during every minute, the number of different chords used by the music piece, the maximum level of a beat in the music piece, and the maximum level of the amplitude concerning the music piece. The second storage device further contains a default database having data representing reference mean values and unbiased variances of feature parameters for each of different sensitivity words. When a user designates a sensitivity word for music-piece selection, the reference mean values and unbiased variances corresponding to the designated sensitivity word are read out from the default database. The value of conformity (matching) between the readout mean values and unbiased variances and the actual mean values and unbiased variances is calculated for each of the music pieces. The music pieces having larger calculated conformity values are selected.
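The following sketch illustrates one way such a conformity value could be computed. It is an assumption made for illustration only: this description does not state the formula used in Japanese application 2005-316943. Here the conformity is taken as the negative of the sum, over the feature parameters, of the squared difference between the actual and reference mean values divided by the sum of the corresponding variances. All names and numbers are hypothetical.

```python
def conformity(actual, reference, eps=1e-9):
    """Conformity between two lists of (mean, unbiased variance) pairs.

    Assumed formula (not taken from the cited publication): the negated
    sum of variance-normalized squared differences of the means, so that
    a value nearer to zero means a better match.
    """
    total = 0.0
    for (mean_a, var_a), (mean_r, var_r) in zip(actual, reference):
        # eps guards against division by zero when both variances are 0.
        total += (mean_a - mean_r) ** 2 / (var_a + var_r + eps)
    return -total


def select(pieces, reference, count=1):
    """Return the names of the pieces with the largest conformity values."""
    ranked = sorted(pieces,
                    key=lambda name: conformity(pieces[name], reference),
                    reverse=True)
    return ranked[:count]


# Hypothetical reference statistics for one sensitivity word, for two
# feature parameters (for example chords per minute and maximum beat level).
REFERENCE = [(12.0, 4.0), (0.8, 0.01)]

# Hypothetical actual statistics of two music pieces.
PIECES = {
    "piece_x": [(11.0, 3.0), (0.75, 0.02)],
    "piece_y": [(4.0, 2.0), (0.3, 0.02)],
}

selected = select(PIECES, REFERENCE)
```

With these sample numbers, "piece_x" is selected because its statistics lie much closer to the reference statistics than those of "piece_y".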
Japanese patent application publication number 2004-163767 discloses a system including a chord analyzer which performs FFT processing of a sound signal to detect a fundamental frequency component and a harmonic frequency component thereof. The chord analyzer decides a chord constitution on the basis of the detected fundamental frequency component. The chord analyzer calculates the intensity ratio of the harmonic frequency component to the fundamental frequency component. From the decided chord constitution and the calculated intensity ratio, a music key information generator detects the music key of a music piece represented by the sound signal. A synchronous environment controller adjusts a lighting unit and an air conditioner into harmony with the detected music key.
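As a rough illustration of the frequency analysis described above, the following sketch locates a fundamental frequency component in a synthetic signal and calculates the intensity ratio of its second harmonic to the fundamental. It uses a plain discrete Fourier transform in place of an FFT, which produces the same spectrum more slowly. The assumptions that the strongest non-DC bin is the fundamental and that only the second harmonic is considered are made for illustration and are not taken from Japanese application 2004-163767.

```python
import cmath
import math


def dft_magnitudes(samples):
    """Magnitudes of a plain DFT (bins 0 .. len/2 - 1)."""
    n = len(samples)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * i / n)
                for i, x in enumerate(samples)))
        for k in range(n // 2)
    ]


def fundamental_and_ratio(samples, sample_rate):
    """Return (fundamental frequency in Hz, harmonic/fundamental ratio)."""
    mags = dft_magnitudes(samples)
    # Assumption: the strongest bin other than DC is the fundamental.
    k0 = max(range(1, len(mags)), key=mags.__getitem__)
    k1 = 2 * k0  # bin of the second harmonic
    ratio = mags[k1] / mags[k0] if k1 < len(mags) else 0.0
    return k0 * sample_rate / len(samples), ratio


SAMPLE_RATE = 8000  # Hz, hypothetical
N = 400             # gives a bin spacing of 20 Hz

# Synthetic tone: a 220 Hz fundamental plus a 440 Hz harmonic at half
# the amplitude.
tone = [
    math.sin(2 * math.pi * 220 * i / SAMPLE_RATE)
    + 0.5 * math.sin(2 * math.pi * 440 * i / SAMPLE_RATE)
    for i in range(N)
]

freq, ratio = fundamental_and_ratio(tone, SAMPLE_RATE)
```

Because both frequencies fall exactly on DFT bins in this example, the detected fundamental is 220 Hz and the intensity ratio is approximately 0.5. Real audio would generally need windowing and peak interpolation, which are omitted here.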
One of the factors deciding an impression about a music piece is the degree of musical pitch strength defined in auditory sense (hearing sense) and related to the music piece, that is, the degree of hearing-related feeling of a musical interval related to the music piece. For example, a music piece consisting mainly of sounds made by definite pitch instruments (fixed-interval instruments) such as a piano causes a strong sense of pitch strength. On the other hand, a music piece consisting mainly of sounds made by indefinite pitch instruments (interval-less instruments) such as drums causes a weak sense of pitch strength. The degree of a sense of pitch strength is closely related to the genre of a music piece.
Another factor deciding an impression about a music piece is a hearing-related feeling about the thickness of sounds. The thickness of sounds depends on the number of sounds simultaneously generated and the overtone structures of the played instruments. The thickness of sounds is closely related to the genre of a music piece. Suppose that there are two music pieces which are the same in melody, tempo, and chord. Even in this case, when the two music pieces differ in the number of sounds simultaneously generated and the overtone structures of the played instruments, the impressions about the music pieces differ accordingly.
It has not been known to use the degree of a sense of pitch strength and the thickness of sounds as feature quantities for each music piece.