1. Field of the Invention
The present invention relates to a method and apparatus for recording and playing back moving picture data and, more particularly, to a method and apparatus suited to partial recording of a digital moving picture data sequence in which intra frame coding is applied only infrequently.
2. Description of the Related Art
In applications that distribute a digital moving picture simultaneously and in real time to an indefinite number of users, such as digital broadcasting, a data structure is adopted that allows each terminal to start decoding received moving picture data immediately, that is, a coded data (bitstream) format that can be decoded from an arbitrary point in time.
FIG. 2 shows a data structure of the MPEG system as an example of a moving picture bitstream in digital broadcasting. In the MPEG system, three kinds of coding systems called intra frame coding, inter frame coding, and bidirectionally predictive video coding are defined.
The “intra frame coding” is a data compression method of directly performing DCT (Discrete Cosine Transform) on a picture of a current frame. A frame to which the intra frame coding is applied is called an I-VOP (Intra-coded Video Object Plane) in MPEG-4 and an I-Picture in MPEG-2. “VOP” is a synonym of “frame” in a rectangular picture. Since an I-VOP does not require decoding information of a preceding frame at the time of coding and decoding, it is used as a decoding start frame when a coded moving picture is accessed at random.
The “inter frame coding” is a coding method of compressing data of a frame by using coded information of a frame that temporally precedes the frame being coded. A frame to which the inter frame coding is applied is called a P-VOP (Predictive-coded VOP) in MPEG-4 and a P-Picture in MPEG-2. The “bidirectionally predictive video coding” is a method of compressing data of a frame by using coded information of two frames, one temporally preceding and one temporally following the frame being coded. A frame to which the bidirectionally predictive video coding is applied is called a B-VOP (Bidirectionally predictive-coded VOP) in MPEG-4 and a B-Picture in MPEG-2.
In the following description, it is assumed that a data structure of an MPEG-4 bitstream is used, and reference characters I, P, and B in the drawings denote I-VOP, P-VOP, and B-VOP, respectively.
The MPEG bitstream shown in FIG. 2 has a data structure in which header information 201, which shows characteristics of the whole sequence such as the bitstream size, and an I-VOP 202 are periodically inserted. With this data structure, even when reception of the bitstream starts in a data portion 203 composed of P-VOPs and B-VOPs (surrounded by broken lines), the terminal can start decoding the received data by waiting for the subsequent header information 201 and I-VOP 202. Therefore, by setting the size of the data portion 203 so that the waiting time does not annoy the user, a moving picture delivery service that is provided to each user almost instantaneously can be realized.
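The waiting behavior described above can be illustrated with a minimal sketch. The list-of-labels stream model ("H" for header information, "I", "P", "B" for VOPs) and the function name `first_decodable_index` are hypothetical and serve only as an illustration; they are not part of any MPEG specification.

```python
from typing import Optional, Sequence


def first_decodable_index(units: Sequence[str], join_at: int) -> Optional[int]:
    """Return the index of the first header ("H") at or after join_at that is
    immediately followed by an I-VOP ("I"), or None if no such pair exists."""
    for i in range(join_at, len(units) - 1):
        if units[i] == "H" and units[i + 1] == "I":
            return i
    return None


# Hypothetical stream in the style of FIG. 2: header and I-VOP inserted periodically.
stream = ["H", "I", "B", "B", "P", "B", "B", "P", "H", "I", "B", "B", "P"]

# Reception starts inside the P/B data portion, so decoding has to wait.
start = first_decodable_index(stream, join_at=3)
skipped = start - 3  # units received but not decodable
```

In this sketch the number of skipped units grows with the distance between consecutive header/I-VOP pairs, which corresponds to the waiting time discussed above.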
In digital broadcasting services that have only recently started, the MPEG-4 header information 201 is composed of, for example, a VOS header 201-1, a VO header 201-2, a VOL header 201-3, and a GOV header 201-4, as shown in FIG. 3.
The VOS header 201-1 includes profile level information for determining an application range of an MPEG-4 product. The VO header 201-2 includes version information of the MPEG-4. The VOL header 201-3 includes information indispensable for decoding received data, such as the picture size, coding bit rate, frame memory size, and applied tools. The GOV header 201-4 includes time information used for reserved playback and the like.
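As an illustration of how a terminal might locate these headers, the following is a minimal sketch that scans a byte string for the three-byte start-code prefix 0x000001 and labels the byte that follows. The start-code values in `classify` are assumptions based on the values commonly listed for MPEG-4 Visual (ISO/IEC 14496-2) and should be verified against the standard; in particular, mapping the "VO header" to 0xB5 is an assumption. The function names and the sample bytes are hypothetical.

```python
from typing import Iterator, Tuple

PREFIX = b"\x00\x00\x01"


def classify(code: int) -> str:
    """Map a start-code byte to a header label (values are assumptions)."""
    if code == 0xB0:
        return "VOS"
    if code == 0xB5:
        return "VO"   # assumed: the header carrying the version information
    if 0x20 <= code <= 0x2F:
        return "VOL"
    if code == 0xB3:
        return "GOV"
    if code == 0xB6:
        return "VOP"
    return "other"


def scan_start_codes(data: bytes) -> Iterator[Tuple[int, str]]:
    """Yield (offset, label) for every start code found in data."""
    pos = data.find(PREFIX)
    while pos != -1 and pos + 3 < len(data):
        yield pos, classify(data[pos + 3])
        pos = data.find(PREFIX, pos + 3)


# Made-up sample: four start codes, each followed by one dummy payload byte.
sample = bytes.fromhex("000001B001" "000001B509" "00000120AA" "000001B610")
found = list(scan_start_codes(sample))
has_gov = any(label == "GOV" for _, label in found)
```

Here `has_gov` is false for the sample, so a recorder relying on the time information of a GOV header would have to handle its absence.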
The data structure in which the header information 201 and the I-VOP 202 are periodically inserted is effective for video recording and for random access to recorded data. For example, in video recording, by analyzing the header information 201 that appears first after the user presses a recording start button, the subsequent data can be recorded. In addition, because skipping the decoding of a B-VOP or B-picture does not degrade the quality of the decoded picture sequence as a whole, frequently inserting B-VOPs into the data portion 203 facilitates fast forward, quick motion playback, and similar operations on recorded data.
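The B-VOP skipping mentioned above can be sketched as a simple filter. The list-of-labels model and the name `fast_forward_units` are hypothetical; a real player would operate on coded VOPs rather than on labels.

```python
from typing import List


def fast_forward_units(vops: List[str]) -> List[str]:
    """Keep only I-VOPs and P-VOPs. B-VOPs can be dropped because no other
    VOP uses them as a prediction reference."""
    return [v for v in vops if v != "B"]


# Hypothetical recorded sequence with two B-VOPs between reference VOPs.
recorded = ["I", "B", "B", "P", "B", "B", "P"]
kept = fast_forward_units(recorded)
speedup = len(recorded) / len(kept)  # fewer VOPs decoded for the same time span
```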
In real-time communication using a radio channel, it is difficult to insert the I-VOP into a bitstream frequently because of limitations on transmission delay, communication capacity, data transmission errors, and power consumption. Consequently, in on-demand video streaming services that presuppose a radio channel, coded data in which use of the I-VOP is avoided as much as possible has to be used, for example, as shown in FIG. 4.
Unlike the data of FIG. 2, the coded data shown in FIG. 4 has a data structure in which a long data sequence 313 consisting of a number of continuous P-VOPs is disposed after header information 311 and an I-VOP 312. In this case, it is common to insert intra-coded blocks into the P-VOP sequence to recover from transmission errors. An intra-coded block refreshes the picture quality of a block that has deteriorated because of a transmission error. In MPEG, one VOP is divided into a plurality of blocks each having a size of 16×16 pixels, and the position of the intra-coded blocks is changed periodically so that all the blocks are refreshed within several VOPs. In the video streaming service, there are cases in which the header information 311 is composed of a VOS header 311-1, a VO header 311-2, and a VOL header 311-3 and includes no GOV header, for example, as shown in FIG. 5.
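The periodic refresh described above can be sketched as a cyclic schedule over the 16×16 blocks of a VOP. The raster-order cycling, the function names, and the QCIF picture size (176×144, that is, 11×9 = 99 blocks) are assumptions chosen for illustration; actual encoders may select the intra-coded blocks differently.

```python
import math
from typing import List


def intra_blocks_for_vop(vop_index: int, num_blocks: int, blocks_per_vop: int) -> List[int]:
    """Return the block indices to intra-code in the given P-VOP,
    cycling through all blocks in raster order."""
    start = (vop_index * blocks_per_vop) % num_blocks
    return [(start + k) % num_blocks for k in range(blocks_per_vop)]


def vops_per_full_refresh(num_blocks: int, blocks_per_vop: int) -> int:
    """Number of consecutive VOPs needed to refresh every block once."""
    return math.ceil(num_blocks / blocks_per_vop)


# Assumed picture: QCIF, 176x144 pixels -> 11 x 9 blocks of 16x16 pixels.
num_blocks = (176 // 16) * (144 // 16)
period = vops_per_full_refresh(num_blocks, blocks_per_vop=11)

# Collect every block refreshed during one full period.
refreshed = set()
for n in range(period):
    refreshed.update(intra_blocks_for_vop(n, num_blocks, 11))
```

Under these assumptions one row of blocks is refreshed per VOP, so the whole picture is refreshed once every `period` VOPs.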
When the frequency of inserting the I-VOP into a bitstream is reduced, the size of the data portion 313 following the I-VOP becomes large. This causes a problem in that, even when the terminal is to record only a specific part of a received stream (coded data sequence), in practice it also has to record the many unnecessary frames received between the I-VOP and the desired picture frames.
For example, assume that the bitstream shown in FIG. 4 is moving picture data of three minutes. Even when the user wishes to record only the 15 seconds of moving pictures at the end of the bitstream, the whole bitstream, from the header information 311 and the I-VOP 312 that are indispensable for decoding up to the target picture frames, has to be recorded. In a terminal having only a small built-in memory, such as a cellular phone, even if the user wishes to selectively record a specific portion of the received moving pictures, the built-in memory may become full before the target stream portion arrives, so that the recording fails. To reliably record a partial stream including the desired picture frames, it is necessary to connect an external storage having a sufficient memory capacity to the terminal in advance.
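The storage overhead in this example can be made concrete with a short calculation. The constant bit rate of 64 kbit/s is an assumed figure for illustration only; the three-minute and 15-second durations are taken from the example above.

```python
def bytes_needed(duration_s: int, bitrate_bps: int) -> int:
    """Storage required for a constant-bit-rate stream of the given duration."""
    return duration_s * bitrate_bps // 8


total_s = 180          # three-minute bitstream (from the example)
wanted_s = 15          # desired portion at the end (from the example)
bitrate_bps = 64_000   # assumed constant bit rate

must_record = bytes_needed(total_s, bitrate_bps)   # header and I-VOP up to the target
wanted = bytes_needed(wanted_s, bitrate_bps)       # the portion the user wants
overhead_ratio = must_record / wanted
```

The ratio depends only on the durations when the bit rate is constant: the terminal has to store twelve times as much data as the user actually wants to keep.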