1. Field of the Invention
The present invention relates to a chapter information creation apparatus that creates chapter information for video image data and a control method therefor, and more particularly to a chapter information creation apparatus that creates chapter information for video image data obtained by hierarchically encoding video image content and a control method therefor.
2. Description of the Related Art
Heretofore, techniques for detecting scene change positions or the like in video image content that is being recorded or played back, and creating information specifying the detected positions as chapter information are known in video image cameras, broadcast program recording apparatuses and the like. Chapter information is, for example, recorded in the data of video image content, and used in cue playback of video image content, editing and the like.
For example, Japanese Patent Laid-Open No. 2006-108729 discloses a technique for detecting scene changes between frames of video image content from the difference between the frames, and automatically creating chapter information.
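By way of illustration only, detection of scene changes from the difference between frames can be sketched as follows. This is a minimal sketch under stated assumptions and is not the method of the cited publication: the frame representation (a flat sequence of 8-bit luminance samples), the mean-absolute-difference measure, the threshold value, and the names `mean_abs_diff` and `detect_scene_changes` are all hypothetical.

```python
from typing import List, Sequence

# Hypothetical representation: one frame is a flat sequence of 8-bit luma samples.
Frame = Sequence[int]


def mean_abs_diff(a: Frame, b: Frame) -> float:
    """Mean absolute difference between two frames of equal size."""
    if len(a) != len(b) or not a:
        raise ValueError("frames must be non-empty and the same size")
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def detect_scene_changes(frames: Sequence[Frame], threshold: float = 30.0) -> List[int]:
    """Return the indices of frames that differ from the preceding frame
    by more than the (assumed) threshold; these are chapter candidates."""
    return [
        i
        for i in range(1, len(frames))
        if mean_abs_diff(frames[i - 1], frames[i]) > threshold
    ]


# Four tiny 2x2 frames: the third frame is much brighter than the second.
frames = [[10] * 4, [12] * 4, [200] * 4, [198] * 4]
chapter_positions = detect_scene_changes(frames)
```

In this sketch, each returned index marks a frame at which a chapter could be created. Note that the analysis operates on decoded frames.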
On the other hand, H.264/SVC (Scalable Video Coding), which is an extension of H.264/AVC (Advanced Video Coding), has been standardized as an example of a hierarchical encoding scheme for video image content. Use of a hierarchical encoding scheme enables video image data having a plurality of resolutions to be hierarchized and encoded in the data of a single video image stream. For example, video images having a plurality of resolutions in the same video image content, such as 640×480 pixel SD resolution and 4096×2160 pixel 4K2K resolution, can be hierarchized and encoded in the data of a single stream.
The field of view can also be differentiated between layers, such that a layer in SD resolution is a close-up of a face and a layer in 4K2K resolution is a full body shot.
In the case where scene changes are detected and chapter information is automatically created with respect to hierarchically-encoded video image content, if the field of view is the same between layers, chapter information can be created at any one layer by applying a conventional scheme and used in common at all layers.
However, in the case of video image content having different fields of view between the layers, a scene change may occur at one layer but not at another layer. For example, consider the case where there is a video image of a scene including a number of people at a high resolution layer and a video image of a close-up of one of the people at a low resolution layer. In this case, the video image of the low resolution layer could change to a close-up of another person, even though there is not a significant change in the video image at the high resolution layer. Thus, with video image content that has been hierarchically encoded to have different fields of view between encoded layers, chapter information needs to be created for each layer having a different field of view.
For example, in the case where chapter information is created using the method disclosed in Japanese Patent Laid-Open No. 2006-108729, scene changes need to be detected by analyzing video images for each encoded layer having a different field of view, giving rise to the problem of increased processing. In particular, scene analysis of a given layer requires decoding of that layer, so that the decoding processing increases in addition to the analysis processing.
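By way of illustration only, the processing load described above can be sketched as follows. This is a minimal sketch under stated assumptions: the layer names, the field-of-view labels, and the functions `create_chapters_per_layer` and `decode_and_detect` are hypothetical, and `decode_and_detect` merely stands in for decoding a layer and analyzing it for scene changes. Layers sharing a field of view reuse one result, whereas each distinct field of view requires its own decoding and analysis.

```python
from typing import Callable, Dict, Hashable, List

# Records which layers had to be decoded and analyzed (for illustration).
decoded_layers: List[str] = []


def create_chapters_per_layer(
    layer_fov: Dict[str, Hashable],
    detect: Callable[[str], List[int]],
) -> Dict[str, List[int]]:
    """Create chapter positions for every layer.

    layer_fov maps a layer name to a label identifying its field of view.
    detect(layer) is assumed to decode that layer and return the frame
    indices of its scene changes.
    """
    by_fov: Dict[Hashable, List[int]] = {}
    chapters: Dict[str, List[int]] = {}
    for layer, fov in layer_fov.items():
        if fov not in by_fov:
            # A new field of view: this layer must be decoded and analyzed.
            by_fov[fov] = detect(layer)
        # Layers with the same field of view use the result in common.
        chapters[layer] = list(by_fov[fov])
    return chapters


def decode_and_detect(layer: str) -> List[int]:
    """Hypothetical stand-in for decoding a layer and detecting scene changes."""
    decoded_layers.append(layer)
    fake_results = {"sd": [120, 450], "hd": [300], "4k2k": [300]}
    return fake_results[layer]


# The SD layer is a close-up; the HD and 4K2K layers share a full body shot.
layer_fov = {"sd": "close_up", "hd": "full_body", "4k2k": "full_body"}
chapters = create_chapters_per_layer(layer_fov, decode_and_detect)
```

In this sketch, two of the three layers are decoded and analyzed because there are two distinct fields of view. If every layer had a different field of view, every layer would have to be decoded and analyzed.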