In the field of music information processing, many studies have been conducted on music retrieval and music understanding. By contrast, no studies have focused on trial listening of music. At present, when “trial listening” to music prerecorded on compact discs (CDs) at a music store, a trial listener often fast-forwards the CD, picks out only the sections of interest, and listens to them. This is because the main purpose of trial listening is to quickly determine whether a selection is the piece of music the listener has been looking for, and whether or not he/she likes it. In the case of popular music, for example, customers often decide by trial listening to sections having some characteristic music structure (hereinafter referred to as characteristic music structure sections), such as chorus sections (i.e., chorus or refrain), which are the most representative, uplifting part of the music, or melody sections that are usually performed repeatedly. This produces a special way of listening in which the trial listener first listens briefly to the music's “intro”, then skips the middle parts of the music by pushing the fast-forward button repeatedly in search of characteristic music structure sections such as chorus or repeated sections, and eventually plays back the characteristic music structure section.
The functions provided by conventional listening stations for music CDs, however, do not support this unique way of trial listening. These listening stations are equipped with the playback-operation buttons typical of an ordinary CD player, and among these, only the fast-forward and rewind buttons can be used to find the chorus section of the music. On the other hand, digital listening stations that have recently come to be installed in CD stores enable playback of several hundred thousand musical selections stored in MP3 or other compression formats from a hard disk or over a network. However, as only the beginning of each musical selection (an interval of about 45 seconds) is mechanically excerpted and stored, a trial listener will not necessarily hear the characteristic music structure section. Although music that begins with the chorus has recently been on the increase in Japan's popular music world, according to the inventor's survey, only about 20% of the pieces of music on Japan's popular music hit chart (top 20 singles ranked weekly from January to December 2001) featured a chorus that begins within 40 seconds of the start of the music.
In one of the conventional chorus detection methods, one chorus section of a specified length is extracted, though incompletely, as a representative part of the audio signal of a piece of music. Logan, B. and Chu, S., Music Summarization Using Key Phrases, Proc. of ICASSP 2000, II749–752 (2000), proposed a method of labeling short extracted frames (1 second each) based on their acoustic features, wherein the frames having the most frequent label are considered to be the chorus. The labeling utilized either clustering based on similarity in acoustic features among the respective sections, or a hidden Markov model. Bartsch, M. A. and Wakefield, G. H., To Catch A Chorus: Using Chroma-based Representations for Audio Thumbnailing, Proc. of WASPAA 2001, 15–18 (2001), proposed a method of dividing a piece of music into short frames, one per beat, based on the result of beat tracking, and extracting as the chorus the part whose acoustic features have the highest similarity across sections of a certain specified length. Foote, J., Automatic Audio Segmentation Using a Measure of Audio Novelty, Proc. of ICME 2000, I-452–455 (2000), pointed out the possibility that a chorus can be extracted as an application of detecting boundaries based on similarity in acoustic features among very short fragments (frames).
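The fixed-length approach described above can be illustrated with a minimal sketch. It is not the implementation of any of the cited works; it only assumes that each frame is already represented by a feature vector and that similarity is measured by cosine similarity. The function names `cosine` and `best_fixed_length_section` are hypothetical and introduced here for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def best_fixed_length_section(frames, length):
    """Return (start, score) of the window of `length` frames that is most
    similar to some other, non-overlapping window of the same length.

    `frames` is a list of per-frame feature vectors (an assumption of this
    sketch). The window length is fixed in advance, so the result says
    nothing about where a chorus actually begins or ends.
    """
    n = len(frames)
    best_start, best_score = None, -1.0
    for s in range(n - length + 1):
        for t in range(n - length + 1):
            if abs(t - s) < length:
                continue  # skip windows that overlap the candidate
            # Mean frame-by-frame similarity between the two windows.
            score = sum(
                cosine(frames[s + k], frames[t + k]) for k in range(length)
            ) / length
            if score > best_score:
                best_start, best_score = s, score
    return best_start, best_score
```

Because `length` is supplied by the caller, a sketch of this kind can only report a window of that predetermined size, which is the limitation of the conventional methods noted in this section.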
Although there is prior art directed to representations equivalent to musical notation, such as standard MIDI files (e.g., Meek, C. and Birmingham, W. P., Thematic Extractor, Proc. of ISMIR 2001, 119–128 (2001); and Jun Muramatsu, Extraction of Features in Popular Songs Based on Musical Notation Information of “Chorus”: Case of Tetsuya Komuro, The Special Interest Group Note of IPSJ, Music Information Science, 2000-MUS-35-1, 1–6 (2000)), this technology cannot be directly applied to mixed sounds, in which it is difficult to separate the sound sources. The conventional chorus section detecting methods can only extract and present a section of a certain specified length, and cannot estimate where the chorus sections begin and end. Furthermore, no prior art has taken modulation (key change) into consideration.
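Why modulation matters can be seen in a small sketch. It assumes 12-bin chroma vectors of the kind used by Bartsch and Wakefield, with one bin per pitch class; under that assumption, a key change moves the energy to different bins, so a direct comparison of a chorus with its modulated repetition yields low similarity. Comparing under all 12 circular shifts is shown here only as one illustrative way of exposing the effect, not as the method of the present invention. The names `rotate` and `best_transposition` are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rotate(chroma, semitones):
    """Circularly shift a chroma vector upward by `semitones` bins."""
    c = list(chroma)
    k = semitones % len(c)
    return c[-k:] + c[:-k] if k else c


def best_transposition(a, b):
    """Return (shift, similarity) for the upward shift of `a` that best matches `b`."""
    candidates = ((s, cosine(rotate(a, s), b)) for s in range(len(a)))
    return max(candidates, key=lambda pair: pair[1])


# Toy chroma vectors (assumed data): a C major triad and a D major triad.
c_major = [1, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0]
d_major = [0, 0, 1, 0, 0, 0, 1, 0, 0, 1, 0, 0]

direct = cosine(c_major, d_major)                    # no shared pitch classes
shift, shifted = best_transposition(c_major, d_major)
```

In this toy case the two triads share no pitch class, so the direct similarity is zero, while shifting the first vector up by two semitones aligns it with the second.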
An object of the present invention is to provide a method and a system capable of easily playing back characteristic music structure sections selected by an interface by using a musical audio data playback apparatus, and an interface and a program to be used for the system.
Another object of the present invention is to provide a music playback method and system capable of easily playing back particularly chorus sections in music by using a musical audio data playback apparatus, and an interface to be used for the system.
A further object of the present invention is to provide a music playback method and system capable of reliably identifying chorus sections in music, and an interface to be used for the system.
Still another object of the present invention is to provide a music playback method and system capable of visually checking distribution of characteristic music structure sections and playback status of musical audio data, and an interface to be used for the system.
Yet another object of the present invention is to provide a music playback method and system capable of visually distinguishing the presence of chorus sections and repeated sections, and an interface to be used for the system.
Another object of the present invention is to provide a music playback method and system capable of selectively playing back characteristic music structure sections merely with an operator's manipulation of selection buttons, and an interface to be used for the system.
A further object of the present invention is to provide a method for easily extracting characteristic music structure sections from statistical data.
Still another object of the present invention is to provide a method, a system, and a program for detecting a chorus section in musical audio data, whereby the problems with the prior art can be solved and any and all chorus sections appearing in the music can be detected.
Yet another object of the present invention is to provide a method, a system, and a program for detecting a chorus section in musical audio data, whereby it can be detected where one chorus section begins and ends.
Another object of the present invention is to provide a method, a system, and a program for detecting a chorus section in musical audio data, whereby a modulated chorus section can be detected.
A further object of the present invention is to provide an apparatus for detecting a chorus section in musical audio data, whereby not only chorus sections but also other repeated sections can be displayed on a display means.
Still another object of the present invention is to provide an apparatus for detecting a chorus section in musical audio data, whereby not only chorus sections but also other repeated sections can be played back.