1. Field of Invention
The present invention relates to indexing technology for dividing a content into a plurality of segments based on video and audio analysis of the content.
2. Description of the Related Art
In recent years, large capacity recording apparatuses, such as HDD recorders and DVD recorders, have become widespread for home use. It is becoming common that such a recording apparatus is provided with a function of automatically and selectively recording broadcast contents according to a user's interests and preferences. Owing to this function, it is expected that such a recording apparatus stores a larger amount of contents than ever.
With such a function, broadcast programs as shown by an EPG (Electronic Program Guide) are recorded as contents. Generally, users seldom watch a recorded content from beginning to end, but instead selectively view specific parts of the content. For example, a user may view a specific piece of news in a news program that interests the user, or a specific part of a music show in which the user's favorite singer makes an appearance. In this way, users can effectively retrieve desired information from a large amount of content data.
To this end, attempts have been made to analyze a content for extracting various features of video and audio data. The content is then indexed using the extracted features in combination, and thus divided into a plurality of segments (hereinafter, “viewing segments”).
Specifically, for example, a technique for detecting a transition frame, at which one series of frames shot continuously as a single scene changes to another scene, is used in combination with a technique for detecting a frame in which a telop (television opaque projector) effect or a caption appears. Combining these techniques allows the detection, for each caption frame, of the transition frame located closest to that caption frame among all the transition frames preceding it. The detected frames are then compared with each other to measure the similarity therebetween. The segmentation is carried out in such a manner that each of the similar frames belongs to a different viewing segment.
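The conventional segmentation described above may be sketched as follows. This is a minimal illustration only, not an implementation taken from any particular apparatus. The function names, the use of normalized color histograms as frame features, the histogram-intersection similarity measure, and the threshold of 0.8 are all assumptions introduced for the example. It is further assumed that a transition frame coinciding with a caption frame counts as preceding it, and that the first frame of the content always opens a viewing segment.

```python
from bisect import bisect_right


def nearest_preceding_transitions(transitions, captions):
    """For each caption frame, pick the closest transition frame at or before it."""
    transitions = sorted(transitions)
    picked = []
    for caption in sorted(captions):
        i = bisect_right(transitions, caption)
        # i == 0 means no transition frame precedes this caption frame.
        if i > 0 and transitions[i - 1] not in picked:
            picked.append(transitions[i - 1])
    return picked


def histogram_similarity(h1, h2):
    """Histogram intersection of two normalized histograms (1.0 means identical)."""
    return sum(min(a, b) for a, b in zip(h1, h2))


def similar_frames(candidates, features, threshold=0.8):
    """Keep the candidates that resemble at least one other candidate (assumed rule)."""
    kept = []
    for f in candidates:
        if any(
            g != f and histogram_similarity(features[f], features[g]) >= threshold
            for g in candidates
        ):
            kept.append(f)
    return kept


def viewing_segments(boundaries, total_frames):
    """Split [0, total_frames) so that each boundary frame opens its own segment."""
    starts = sorted(set(boundaries) | {0})
    ends = starts[1:] + [total_frames]
    return list(zip(starts, ends))


if __name__ == "__main__":
    # Hypothetical frame indices and three-bin histograms, for illustration only.
    transitions = [0, 40, 100, 160, 220]
    captions = [10, 110, 230]
    features = {
        0: [0.5, 0.5, 0.0],
        100: [0.5, 0.4, 0.1],
        220: [0.0, 0.1, 0.9],
    }
    candidates = nearest_preceding_transitions(transitions, captions)
    boundaries = similar_frames(candidates, features)
    print(viewing_segments(boundaries, total_frames=300))
```

In this sketch, frames 0 and 100 resemble each other and therefore open separate viewing segments, whereas frame 220 resembles neither of them and is not treated as a boundary. Because the similarity rule and threshold are fixed in advance, the sketch also shows how the result depends on the captions of a given program following one expected pattern.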
However, a problem arises as a result of the increasing number of terrestrial television channels as well as the diversity of available broadcasting styles, including satellite broadcasting, cable broadcasting, and video streaming. That is, more and more types of contents are available for viewing at home, so that conventionally known techniques may be insufficient to suitably index all the types of contents.
This insufficiency arises because each genre or broadcast program has a different segmentation pattern associated with features such as the size, layout, and appearance timing of captions.