With regard to video recorders, such as digital cameras, there is demand for a function that extracts, from AV content captured by a user, a section that the user finds interesting (hereinafter referred to as an “interesting section”).
One conventional example is a video editing device or a video recorder that can extract an interesting section by having the user determine the start time of the interesting section by operating a controller (for example, by pressing an input button of the controller) and then determine the end time by operating the controller again. An example of such a video editing device is a PC running video editing software.
With such a device, however, the user needs to operate the controller at the appropriate time while watching the AV content in order to extract the interesting section as desired, which requires a certain amount of proficiency. If the user fails to determine the start and end times of the interesting section appropriately, the user needs to repeat the same controller operations while watching the AV content again. Such an approach therefore requires time and effort to extract an interesting section.
In view of the above, a video editing device has been proposed that has a function to set a start-point and an end-point by adding an offset time, set by the user in advance in accordance with the content, to a time indicated by the user (see Patent Literature 1). For example, if this video editing device is set so that the start-point is brought forward from the user-designated time by the offset time, the desired start-point is included in the interesting section even if the user is late in indicating the start time, and a more appropriate interesting section is thereby extracted.
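The offset adjustment described above can be illustrated with a minimal sketch. The function name `apply_offsets` and its parameters are hypothetical and do not come from Patent Literature 1. Extending the end-point later by an offset, and clamping the result to the content length, are assumptions made only for illustration.

```python
def apply_offsets(indicated_start, indicated_end,
                  start_offset, end_offset, duration):
    """Shift user-indicated times (seconds) by preset offset times.

    The start-point is brought forward (earlier) by start_offset so that
    a late button press still captures the desired start. Moving the
    end-point later by end_offset is an assumption for illustration.
    """
    if indicated_start > indicated_end:
        raise ValueError("start time must not exceed end time")
    # Clamp to the playable range [0, duration] of the AV content.
    start_point = max(0.0, indicated_start - start_offset)
    end_point = min(duration, indicated_end + end_offset)
    return start_point, end_point
```

For instance, with a 3-second start offset, a press at 12 s yields a start-point at 9 s. Note that the offsets are fixed in advance, so the result depends on how well the user chose them for the content in question.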
As another example, technology for extracting an interesting section has been proposed whereby an acoustic feature condition for a start time (start-point) of the interesting section and an acoustic feature condition for an end time (end-point) of the interesting section are set in advance, and the interesting section is extracted by determining the start-point and the end-point based on these acoustic feature conditions (see Patent Literature 2).
Yet another example of proposed technology displays an acoustic waveform together with a bar indicating playback time during video playback, thereby making the acoustic waveform viewable along with the video. In this way, this technology supports the extraction of a starting point and ending point based on contour information on the amplitude of a sound (see Patent Literature 3).
Another proposal is a simple method for cueing the sound that accompanies video in broadcast content or a commercial, in which the starting and ending points of a sound, in particular of speech, are detected based on whether the contour of the amplitude power (the envelope) exceeds a set threshold (see Patent Literature 4).
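As a rough illustration of threshold-based detection of this kind, the following sketch computes the mean power of fixed-length frames and reports the runs of frames whose power exceeds a threshold. It is not the method of Patent Literature 4. The function name `detect_sound_sections`, the frame-based power estimate used in place of a true envelope, and all parameters are assumptions.

```python
def detect_sound_sections(samples, frame_len, threshold):
    """Return (start_frame, end_frame) pairs, end exclusive, for runs of
    frames whose mean power exceeds threshold.

    Mean power per frame is used here as a simple stand-in for the
    amplitude power envelope (an assumption for illustration).
    """
    sections = []
    start = None  # index of the frame where the current run began
    n_frames = len(samples) // frame_len  # a trailing partial frame is ignored
    for i in range(n_frames):
        frame = samples[i * frame_len:(i + 1) * frame_len]
        power = sum(s * s for s in frame) / frame_len
        if power > threshold:
            if start is None:
                start = i  # power rose above threshold: starting point
        elif start is not None:
            sections.append((start, i))  # power fell back: ending point
            start = None
    if start is not None:
        # The sound was still active at the end of the signal.
        sections.append((start, n_frames))
    return sections
```

Frame indices can be converted to times by multiplying by `frame_len` and dividing by the sampling rate. A single fixed threshold of this kind reacts to any sufficiently loud sound, so it locates sound boundaries rather than sections that a particular user finds interesting.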