Uncompressed digital video content is enormous in volume, so in many cases it is compressed before being recorded, for example according to the MPEG-2 video scheme (see “ISO 13818-2:2000, Generic coding of moving pictures and associated audio information: Video”) or the H.264 scheme (see “ITU-T Recommendation H.264 (03/05), Advanced video coding for generic audiovisual services”).
These compression schemes, however, perform interframe prediction to achieve higher compression ratios, so playback of the content cannot be started at an arbitrary point in time. Note that a frame at which playback can be started is termed a keyframe (random access point); the keyframe corresponds to an I-picture in the MPEG-2 video scheme and to an IDR picture in the H.264 scheme.
For special playback, such as playback starting from some midpoint of the content and fast-forward/fast-rewind playback, i.e. random access to a file, it is important to manage the time of each keyframe and its location in the recorded data.
One file format that includes keyframe management information is, for example, ASF (Advanced Systems Format), disclosed in Non-patent document 1. In the ASF, a structure termed an index object is provided in a file. Recording the location and time of each keyframe in the index object makes special playback easier to carry out.
The following will discuss the index object of the ASF and its use with reference to FIGS. 11, 12, and 13.
FIG. 11 is an explanatory view of the general structure of an ASF file. Reference numeral 1101 represents a header object, in which attributes common to the entire file are recorded. Reference numeral 1102 represents a data object, in which compressed video and audio data are stored. The data object is made up of a plurality of packets. Reference numeral 1103 represents one packet included in the data object 1102. All the packets are of fixed length. Since one frame of compressed video is generally larger than one packet, a frame is in many cases stored divided across a plurality of packets. Division into packets is likewise performed in a case where video and audio are stored at the same time. In the header object 1101, ID information is described for distinguishing a packet in which a video frame is stored from a packet in which audio data is stored. Reference numeral 1104 represents an index object, which will be described in detail later. The location of the index object 1104 is recorded in the header object 1101 so that the index object can be found easily.
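The fixed-length packet layout described above can be illustrated with a minimal sketch. It is a simplification under stated assumptions, not the ASF packet format itself: packet headers are ignored, the last packet of a frame is assumed to be zero-padded, and the names `split_into_packets`, `packet_offset`, and `data_start` are hypothetical.

```python
from typing import List


def split_into_packets(frame: bytes, packet_size: int) -> List[bytes]:
    """Divide one compressed frame into fixed-length packets.

    Assumption: the final packet is padded with zero bytes so that
    every packet has exactly packet_size bytes.
    """
    if packet_size <= 0:
        raise ValueError("packet_size must be positive")
    packets = []
    for start in range(0, len(frame), packet_size):
        chunk = frame[start:start + packet_size]
        packets.append(chunk.ljust(packet_size, b"\x00"))
    return packets


def packet_offset(data_start: int, packet_size: int, packet_number: int) -> int:
    """Byte offset of a packet, given a zero-based packet number.

    Because all packets have the same length, the location follows
    directly from the packet number (data_start is a hypothetical byte
    offset of the first packet in the file).
    """
    return data_start + packet_number * packet_size
```

Because every packet has the same length, a packet number alone is enough to locate a packet in the recorded data, which is what allows an index to refer to keyframes by packet number.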
FIG. 12 is an explanatory view showing the general structure of the index object (termed “simple index object” in Non-patent document 1, but herein termed simply “index object”). Reference numeral 1201 represents a time interval T between index entries. Reference numeral 1202 represents the total number N of index entries. Reference numeral 1203 represents a sequence of N index entries. The index entries are arranged in time order at regular intervals T, so that the index entry corresponding to a given time can be found easily. Reference numeral 1204 represents one index entry in the sequence. The index entry is composed of a packet number field and a packet count field, which are represented by reference numerals 1205 and 1206, respectively. The packet number field 1205 stores the number of the head packet of the keyframe located closest to the time indicated by the index entry. The packet count field 1206 stores the number of packets required for reconstruction of that keyframe.
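As an illustration only, the structure of FIG. 12 might be modeled in memory as follows. This is a sketch rather than the on-disk layout defined in Non-patent document 1; the class and field names are hypothetical, T is assumed to be expressed in seconds, and mapping a time that falls between two entries to the earlier entry is an assumption.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class IndexEntry:
    packet_number: int  # field 1205: head packet of the closest keyframe
    packet_count: int   # field 1206: packets needed to rebuild that keyframe


@dataclass
class IndexObject:
    interval: float            # 1201: time interval T (assumed seconds)
    entries: List[IndexEntry]  # 1203: the N entries; N (1202) is len(entries)

    def entry_for_time(self, t: float) -> IndexEntry:
        """Return the index entry for time t.

        Entry k corresponds to time T*k, so the entry is found by
        division. Assumption: times between entries map to the earlier
        entry, and out-of-range times are clamped.
        """
        if not self.entries:
            raise ValueError("index object has no entries")
        k = int(t // self.interval)
        k = max(0, min(k, len(self.entries) - 1))
        return self.entries[k]
```

Because the entries are spaced at regular intervals, the lookup is a single division and needs no search through the entry sequence.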
FIG. 13 is an explanatory view showing the correspondence between index entries and video frames. Rectangles at the top of FIG. 13 represent video frames. Among them, rectangles with hatching represent keyframes, and rectangles without hatching represent frames other than keyframes. A number line at the bottom of FIG. 13 represents time, and each time at which an index entry exists is indicated by a black circle. As an example, take the case where playback is to be started T×k seconds from the head of the motion picture content. Since index entries exist at time intervals T, the index entry at T×k seconds is an index entry 1301, which is the (k+1)th index entry from the head of the motion picture content. However, since no keyframe exists at the point in time T×k seconds from the head of the motion picture content, the index entry 1301 indicates a keyframe 1302, which is the keyframe closest to that point in time. More specifically, the index entry 1301 describes (a) the packet number of the first packet in which the keyframe 1302 is stored and (b) the total number of packets required for reconstruction of the keyframe 1302. For playback, the location of that packet in the recorded data is calculated from the packet number and the packets required for reconstruction of the keyframe 1302 are extracted, so that playback can be started at the time closest to time T×k.
[Non-Patent Document 1] “Advanced Systems Format (ASF) Specification, Revision 01.20.03”, [online], December 2004, Microsoft Corporation, [Searched on Apr. 15, 2005], Internet <URL: http://www.microsoft.com/windows/windowsmedia/format/asfspec.aspx>
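The correspondence described with reference to FIG. 13 can be sketched as follows. This is a hedged illustration, not the procedure of Non-patent document 1: the function names, the tuple layout of `keyframes`, the `data_start` offset, and the tie-breaking rule (the earlier keyframe wins when two are equally close) are all assumptions, and packet headers are ignored.

```python
from typing import List, Tuple

# Hypothetical keyframe record: (time in seconds, head packet number, packet count)
Keyframe = Tuple[float, int, int]


def build_index(keyframes: List[Keyframe], interval: float,
                duration: float) -> List[Tuple[int, int]]:
    """Create one (packet_number, packet_count) entry per time 0, T, 2T, ...

    Each entry points at the keyframe closest in time to T*k. min()
    returns the first minimal element, so the earlier keyframe wins ties
    when keyframes are sorted by time.
    """
    if not keyframes:
        raise ValueError("at least one keyframe is required")
    entries = []
    k = 0
    while k * interval <= duration:
        t = k * interval
        nearest = min(keyframes, key=lambda kf: abs(kf[0] - t))
        entries.append((nearest[1], nearest[2]))
        k += 1
    return entries


def locate_keyframe(entries: List[Tuple[int, int]], k: int,
                    data_start: int, packet_size: int) -> Tuple[int, int]:
    """Byte offset and byte length of the packets holding the keyframe
    indicated by the (k+1)th entry, i.e. the entry for time T*k."""
    packet_number, packet_count = entries[k]
    offset = data_start + packet_number * packet_size
    return offset, packet_count * packet_size
```

For example, with keyframes at 0.0, 1.4, and 3.1 seconds and T = 1 second, the entries for times 1 and 2 both indicate the keyframe at 1.4 seconds, so a seek to either time starts playback from that keyframe.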