1. Field of the Invention
This invention relates to an apparatus and a method for processing acoustic signals, in which an index to characteristic portions of the acoustic signals of e.g. an event is generated, and to an apparatus and a method for recording signals, in which the index is imparted to the image signals and/or acoustic signals at the time of recording to enable skip reproduction or summary reproduction. This invention also relates to a program for causing a computer to execute the acoustic signal processing or recording.
2. Description of Related Art
In broadcast signals, or in image/acoustic signals recorded therefrom, it is useful to detect crucial scenes automatically in order to impart an index or to formulate a summary image, so that the contents may be comprehended easily or the necessary signal portions retrieved expeditiously. For example, in an image of a sports event, preparation of a digest of the image or retrieval of a specified scene for secondary use may be facilitated by automatically generating an index to a climax portion and by imparting the index to the image/acoustic signals, such as by multiplexing.
For this reason, there is proposed in the cited reference 1 (Japanese Laid-Open Patent Publication 2001-143451) a technique in which a climax portion of an event, such as a sports event, is automatically detected and imparted as an index, based on the combination of relative values of the power level of the frequency spectrum and that of a specified frequency component. This technique, which detects the sound emitted by the spectators at the climax of the event, can be applied to a large variety of events, and may be used for detecting the signal portions corresponding to crucial points throughout the course of the event.
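By way of illustration only, detection of this general kind may be sketched as follows. The sketch is not taken from the cited reference 1. The function names, the frequency band, the two thresholds, and the use of the mean frame power as the reference for the relative values are all assumptions made for the example.

```python
import cmath
import math


def frame_powers(frame, sample_rate, f_lo, f_hi):
    """Return (total_power, band_power) for one frame using a naive DFT.

    The DC and Nyquist bins are skipped. band_power sums only the bins
    whose centre frequency lies in [f_lo, f_hi] (a hypothetical band).
    """
    n = len(frame)
    total = 0.0
    band = 0.0
    for k in range(1, n // 2):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * i / n)
            for i, x in enumerate(frame)
        )
        power = abs(coeff) ** 2
        total += power
        if f_lo <= k * sample_rate / n <= f_hi:
            band += power
    return total, band


def climax_frames(frames, sample_rate, f_lo, f_hi,
                  power_ratio=2.0, band_ratio=0.5):
    """Return indices of frames flagged as climax candidates.

    Assumed rule: a frame is flagged when its total power is at least
    power_ratio times the mean total power of all frames AND at least
    band_ratio of its power lies in the specified band.
    """
    powers = [frame_powers(f, sample_rate, f_lo, f_hi) for f in frames]
    mean_total = sum(total for total, _ in powers) / len(powers)
    flagged = []
    for index, (total, band) in enumerate(powers):
        if total == 0.0:
            continue  # a silent frame cannot be a climax candidate
        if total >= power_ratio * mean_total and band / total >= band_ratio:
            flagged.append(index)
    return flagged
```

Because such a rule evaluates only power levels, any loud sound falling within the specified band is flagged regardless of its source.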
However, the technique disclosed in the above-mentioned Patent Publication suffers from the problem that, since factors relating to the sound quality, such as the shape of the spectrum, are not evaluated, the detection precision is inherently low, and that the technique cannot be applied to a case where an extraneous sound co-exists with the sound of the specified frequency.
Consequently, the technique can be applied only to acoustic signals which have been recorded on the event site by professional engineers of e.g. a broadcasting station, and in which no extraneous signals are mixed. The technique cannot be applied to acoustic signals mixed with inserted speech, such as an announcer's speech, a commentator's speech or a commercial message, as exemplified by broadcast signals. Additionally, the technique can scarcely be applied to a case where an amateur, such as one of the spectators, records the scene, because ambient sound, such as speech or conversation, is superposed on the acoustic signals being recorded.