1. Field of the Invention
The present invention relates to an apparatus which detects music (musical piece) sections from an audio including speech sections and music sections in a mixed manner.
2. Description of Related Art
In general, an aired audio often includes sections carrying speeches of an announcer and music sections in a mixed manner. When a listener wishes to record his/her favorite musical piece while listening to the audio, the listener has to manually start recording the musical piece at a timing when the musical piece begins, and to manually stop recording the musical piece at a timing when the musical piece ends. These manual operations are troublesome for the listener. Moreover, if a listener suddenly decides to record a favorite musical piece which is aired, it is usually impossible to thoroughly record the musical piece from its beginning without missing any part. In such case, it is effective to record an entire aired program first, and then extract the favorite musical piece from the recorded program by editing. This editing becomes easier by separating music sections from the aired program beforehand and by playing back only the separated music sections.
To this end, a technology for automatically separating music sections and speech sections from each other by analyzing characteristics of each of the sections. A technology disclosed by Japanese Patent Application Laid-Open Publication No. 2004-258659 is for separating a musical piece and a speech from each other by using characteristic amounts in terms of frequencies such as mel-frequency cepstral coefficients (MFCCs). However, the technology disclosed by the Publication No. 2004-258659 has a problem that a process for calculating the characteristic amount in a frequency area of an audio signal becomes vast because the process is so complicated that the workload for the process becomes large.