A function for clipping a specific scene containing a certain feature from a long video and audio signal for viewing is used in devices for recording and viewing TV programs (recorders), for example, and is referred to as “highlight playback” or “digest playback”. Conventionally, the technology for clipping a specific scene analyzes video signals or audio signals to calculate parameters each representing a feature of the signals, and classifies the input video and audio signal by performing determination according to a predetermined condition using the calculated parameters, thereby clipping a section considered to be the specific scene. The rule for determining the specific scene differs depending on the content of the target input video and audio signal and on the type of scene that the function is to provide to viewers. For example, if the function plays exciting scenes in sports programs as the specific scene, the level of cheering by the audience included in the input audio signal is used in the rule for determining the specific scene. Cheering by the audience is noise-like in terms of audio signal characteristics, and may be detected as background noise included in the input audio signal. An example of a determination process on audio signals using the signal level, peak frequency, major voice spectrum width of the sound, and others has been disclosed (see Patent Literature 1). With this method, the frequency characteristics and the change in signal level of the input audio signal can be used to identify the section including the cheering by the audience. However, there is a problem in that it is difficult to obtain a stable determination result because, for example, the peak frequency is sensitive to changes in the input audio signal.
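As a rough illustration of the conventional flow described above (calculate per-frame parameters, apply a predetermined condition, clip the matching section), the following Python sketch may help. It is not the method of Patent Literature 1. The function names `frame_features` and `clip_sections`, the frame length of 512 samples, and the level threshold of -20 dB are all assumptions chosen for illustration.

```python
import numpy as np


def frame_features(frame, sample_rate):
    """Return (level_db, peak_hz) for one frame of mono audio samples."""
    frame = np.asarray(frame, dtype=float)
    # Signal level: RMS in dB relative to full scale 1.0 (floored to avoid log(0)).
    rms = np.sqrt(np.mean(frame ** 2))
    level_db = 20.0 * np.log10(max(rms, 1e-12))
    # Peak frequency: the bin with the largest magnitude in the windowed spectrum.
    spectrum = np.abs(np.fft.rfft(frame * np.hanning(len(frame))))
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / sample_rate)
    peak_hz = float(freqs[np.argmax(spectrum)])
    return level_db, peak_hz


def clip_sections(signal, sample_rate, frame_len=512, level_threshold_db=-20.0):
    """Return (start_sec, end_sec) spans whose frame level exceeds the threshold.

    The threshold rule is a hypothetical stand-in for the "predetermined
    condition"; a real rule would combine several parameters.
    """
    signal = np.asarray(signal, dtype=float)
    n_frames = len(signal) // frame_len
    sections, start = [], None
    for i in range(n_frames):
        frame = signal[i * frame_len:(i + 1) * frame_len]
        level_db, _ = frame_features(frame, sample_rate)
        loud = level_db > level_threshold_db
        if loud and start is None:
            start = i  # a candidate section begins at this frame
        elif not loud and start is not None:
            sections.append((start * frame_len / sample_rate,
                             i * frame_len / sample_rate))
            start = None
    if start is not None:  # close a section that runs to the end of the signal
        sections.append((start * frame_len / sample_rate,
                         n_frames * frame_len / sample_rate))
    return sections
```

Because `peak_hz` is taken from a single spectral maximum, it can jump between unrelated bins from one frame to the next when the input is noise-like, which is one way to picture the instability noted above.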
On the other hand, parameters for smoothly and precisely representing the spectrum change in the input audio signal include a parameter representing an approximate shape of the spectrum distribution, which is referred to as a spectrum envelope. Typical examples of spectrum envelope parameters include Linear Prediction Coefficients (LPC), Reflection Coefficients (RC), Line Spectral Pairs (LSP), and others. As an example, a method has been disclosed that uses LSP as a feature parameter and uses, as one of the determination parameters, the amount of change in the current LSP parameter with respect to a moving average of past LSP parameters (see Patent Literature 2). According to this method, it is possible to stably determine whether the input audio signal is a background noise section or a speech section, using the frequency characteristics of the input audio signal, and to classify the sections accordingly.
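The determination parameter described above, the change in the current LSP vector relative to a moving average of past LSP vectors, can be sketched as follows. This is only a minimal sketch and not the procedure of Patent Literature 2. It assumes the per-frame LSP vectors have already been computed elsewhere. The names `lsp_change` and `classify_frames`, the Euclidean distance, the history length of 8 frames, the threshold of 0.1, and the rule that a small change means background noise are all assumptions made for illustration.

```python
import numpy as np


def lsp_change(lsp_frames, history=8):
    """Distance of each frame's LSP vector from the mean of its past frames.

    lsp_frames: array of shape (n_frames, order), one LSP vector per frame.
    Up to `history` preceding frames are averaged; the first frame has no
    past, so its change is defined here as 0.0 (an assumption).
    """
    lsp_frames = np.asarray(lsp_frames, dtype=float)
    change = np.zeros(len(lsp_frames))
    for t in range(1, len(lsp_frames)):
        past = lsp_frames[max(0, t - history):t]
        moving_avg = past.mean(axis=0)
        # Euclidean distance is an assumed measure of "amount of change".
        change[t] = np.linalg.norm(lsp_frames[t] - moving_avg)
    return change


def classify_frames(change, threshold=0.1):
    """Label frames with a small change as "noise" and the rest as "speech".

    Hypothetical rule: a stationary background keeps the spectrum envelope
    close to its recent average, while speech moves it away.
    """
    return np.where(np.asarray(change) < threshold, "noise", "speech")
```

For example, five identical LSP vectors followed by one shifted vector would give a change of zero for the first five frames and a nonzero change for the last, so only the last frame would be labeled `"speech"` under these assumed settings.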