1. Field of the Invention
The present invention relates to a recording apparatus, a recording method, and a recording program, a recording/reproducing apparatus, a recording/reproducing method, a recording/reproducing program, an editing apparatus, an editing method, and an editing program for performing an editing processing on a clip configured to include AV (Audio/Video) stream data.
2. Description of the Related Art
In the related art, a DVD (Digital Versatile Disc) having a recording capacity equal to or larger than 4.7 GB (gigabytes) has come into widespread use as a recording medium that is detachable from a recordable recording/reproducing apparatus, that has a relatively large recording capacity, and that is suited to record AV (Audio/Video) data including video data and audio data. Japanese Patent Application Laid-Open (JP-A) No. 2004-350251 discloses an imaging apparatus for recording AV data on a recordable DVD in DVD-Video format.
It is desirable that the AV data recorded in such a recordable recording medium be editable. A specific example of an editing method is one that edits the AV data while dividing the AV data into predetermined units. Video data and audio data included in the AV data are generally recorded in the recording medium as predetermined multiplexed data units.
Furthermore, the video data is generally recorded in the recording medium after being compression-coded by a predetermined system because the video data is considerably large in data capacity. As a standard system for compression-coding the video data, the MPEG2 (Moving Picture Experts Group 2) system is known. Moreover, ITU-T (International Telecommunication Union-Telecommunication Standardization Sector) Recommendation H.264 and ISO (International Organization for Standardization)/IEC (International Electrotechnical Commission) International Standard 14496-10 (MPEG-4 Part 10) Advanced Video Coding (hereinafter, abbreviated as “H.264|AVC”), which enable efficient coding by further advancing the MPEG2-based compression coding, have also become widespread.
According to the MPEG2 and the H.264|AVC systems, not only intraframe coding using orthogonal transform but also interframe coding using predictive coding based on motion compensation is performed, thereby further improving compression ratio. Using MPEG2 as an example, interframe compression based on predictive coding will be described below.
An outline of a data stream structure according to the MPEG2 will first be described. The MPEG2 is a combination of motion-compensation-based predictive coding and DCT (discrete cosine transform)-based compression coding. According to the MPEG2, data has a hierarchical structure in which a block layer, a macroblock layer, a slice layer, a picture layer, a GOP (Group Of Pictures) layer, and a sequence layer are arranged in this order from the lowest layer. The block layer is configured to include DCT blocks that are units for performing the DCT. The macroblock layer is configured to include a plurality of DCT blocks. The slice layer is configured to include a header part and one or more macroblocks. The picture layer is configured to include a header part and one or more slices.
One picture corresponds to one image plane. Boundaries of the layers are made identifiable by predetermined identification codes, respectively.
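The layer hierarchy described above can be illustrated with a minimal sketch. The class names, the 64-coefficient block size, and the helper count_dct_blocks are assumptions introduced only for illustration and do not reproduce the MPEG2 bitstream syntax.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DctBlock:
    # Assumption for illustration: one block holds 64 coefficients (8 x 8).
    coefficients: List[int] = field(default_factory=lambda: [0] * 64)


@dataclass
class Macroblock:
    blocks: List[DctBlock]          # a plurality of DCT blocks


@dataclass
class Slice:
    header: bytes                   # header part
    macroblocks: List[Macroblock]   # one or more macroblocks


@dataclass
class Picture:
    header: bytes                   # header part
    slices: List[Slice]             # one or more slices


@dataclass
class Gop:
    header: bytes
    pictures: List[Picture]


@dataclass
class Sequence:
    gops: List[Gop]


def count_dct_blocks(seq: Sequence) -> int:
    """Walk the hierarchy from the sequence layer down to the block layer."""
    return sum(
        len(mb.blocks)
        for gop in seq.gops
        for pic in gop.pictures
        for sl in pic.slices
        for mb in sl.macroblocks
    )


# Hypothetical content: one GOP, one picture, one slice, two macroblocks
# of four DCT blocks each.
mb = Macroblock(blocks=[DctBlock() for _ in range(4)])
seq = Sequence(gops=[Gop(b"", [Picture(b"", [Slice(b"", [mb, mb])])])])
total_blocks = count_dct_blocks(seq)
```

Each layer simply contains a list of elements of the layer below it, which mirrors the nesting described in the text.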
The GOP layer is configured to include a header part, an I (Intra-coded) picture based on intraframe coding, a P (Predictive-coded) picture based on predictive coding, and a B (Bidirectionally predictive-coded) picture based on predictive coding. The I picture contains only information of its own and is decodable per se. The P and B pictures are not decodable per se. A P picture requires a temporally previous image as a reference image, and a B picture requires temporally previous and later images as reference images. For example, the P picture is decoded using a temporally previous I or P picture as a reference image. The B picture is decoded using two temporally previous or later I or P pictures as reference images. A self-contained group including at least one I picture is referred to as a “GOP”, which is regarded as an independently accessible minimum unit in an MPEG stream.
The GOP is configured to include one or a plurality of pictures. It is assumed hereafter that the GOP is configured to include a plurality of pictures. There are two types of GOP, i.e., a closed GOP that has a closed structure and is completely decodable by itself, and an open GOP that is decoded using information on the GOP immediately preceding the open GOP in order of coding. The open GOP can ensure higher image quality and is used more often than the closed GOP because the open GOP is decoded using more information than that used by the closed GOP.
Referring to FIGS. 1A to 1C, processing for decoding interframe-compressed data will be described. It is assumed that one GOP includes one I picture, four P pictures, and 10 B pictures, i.e., 15 pictures in all. As shown in FIG. 1A, the I, P, and B pictures are displayed in a display order of “B0B1I2B3B4P5B6B7P8B9B10P11B12B13P14”. In FIGS. 1A to 1C, each subscript indicates a display order number.
In the example, the first two B pictures, i.e., the B0 and B1 pictures, are predicted from and decoded using the last P14 picture in the GOP immediately preceding this GOP and the I2 picture in the same GOP. The first P picture in the GOP, i.e., the P5 picture, is predicted from and decoded using the I2 picture in the same GOP. Each of the remaining P8, P11, and P14 pictures is predicted from and decoded using the closest past P picture. Moreover, each of the B pictures subsequent to the I picture is predicted from and decoded using temporally previous or later I and/or P pictures.
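The reference relationships in this example can be sketched as follows. The function reference_map and the label "prev.P14" for the last P picture of the preceding GOP are hypothetical names introduced for illustration. The sketch assumes that each B picture refers to the nearest I or P picture on each side in display order.

```python
import re


def reference_map(display, prev_last_anchor="prev.P14"):
    """Return {picture: [reference pictures]} for one GOP in display order."""
    anchor_idx = [i for i, p in enumerate(display) if p[0] in "IP"]
    refs = {}
    for i, pic in enumerate(display):
        past = [display[a] for a in anchor_idx if a < i]
        future = [display[a] for a in anchor_idx if a > i]
        # Nearest earlier I/P picture, or the preceding GOP's last P picture.
        nearest_past = past[-1] if past else prev_last_anchor
        if pic[0] == "I":
            refs[pic] = []                       # decodable per se
        elif pic[0] == "P":
            refs[pic] = [nearest_past]           # one earlier reference
        else:
            refs[pic] = [nearest_past] + future[:1]  # earlier and later
    return refs


display = re.findall(r"[IPB]\d+", "B0B1I2B3B4P5B6B7P8B9B10P11B12B13P14")
refs = reference_map(display)

# The GOP is open if any picture refers to the preceding GOP.
is_open_gop = any(r.startswith("prev.") for rs in refs.values() for r in rs)
```

Under these assumptions, the B0 and B1 pictures are the only pictures that refer outside the GOP, which is what makes the example GOP an open GOP.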
Each of the B pictures is predicted and decoded using temporally previous or later I and P pictures. Due to this, it is necessary to decide the order in which the I, P, and B pictures are arranged in a stream or recording medium in view of the order of decoding. Namely, it is typically necessary for the I or P pictures used to decode a B picture to be decoded prior to the B picture.
In the example, as shown in FIG. 1B, the pictures are arranged in the stream or recording medium in order of “I2B0B1P5B3B4P8B6B7P11B9B10P14B12B13,” and decoded in this order. In FIG. 1B, each subscript corresponds to that shown in FIG. 1A and indicates a display order number.
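The rearrangement from the display order of FIG. 1A into the stream order of FIG. 1B can be sketched as follows. The function name display_to_coded is hypothetical, and the sketch assumes that every B picture is placed immediately after the next I or P picture in display order.

```python
import re


def display_to_coded(display):
    """Reorder pictures so each I/P picture precedes the B pictures
    that are displayed before it but need it as a reference."""
    coded, pending_b = [], []
    for pic in display:
        if pic[0] == "B":
            pending_b.append(pic)      # wait for the next I or P picture
        else:
            coded.append(pic)          # I or P picture goes first
            coded.extend(pending_b)    # then the B pictures that waited
            pending_b = []
    # Assumption: trailing B pictures without a later I/P picture in this
    # GOP are simply appended; a real encoder would handle them with the
    # next GOP.
    coded.extend(pending_b)
    return coded


display = re.findall(r"[IPB]\d+", "B0B1I2B3B4P5B6B7P8B9B10P11B12B13P14")
coded = "".join(display_to_coded(display))
# coded == "I2B0B1P5B3B4P8B6B7P11B9B10P14B12B13"
```

The result for the 15-picture example matches the order given for FIG. 1B.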
A decoding processing performed by a decoder is as follows. As shown in FIG. 1C, the I2 picture is decoded first, and the B0 and B1 pictures are predicted from and decoded using the decoded I2 picture and the rearmost P14 picture (in display order) in a closest past GOP. The B0 and B1 pictures are output from the decoder in order of decoding, followed by output of the I2 picture. After the B1 picture is output, the P5 picture is predicted from and decoded using the I2 picture. Thereafter, the B3 picture and the B4 picture are predicted from and decoded using the I2 picture and the P5 picture. The decoded B3 and B4 pictures are output from the decoder in order of decoding, followed by output of the P5 picture.
Subsequently, the following processing is repeated in a similar manner. The P or I pictures used to predict the B picture are decoded prior to the B picture, the B picture is predicted and decoded using the decoded P or I pictures, the decoded B picture is output, and the P or I pictures used to decode the B picture are output. The picture arrangement in the recording medium or stream as shown in FIG. 1B is a generally used arrangement.
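The output side of this processing can be sketched as follows. It is a simplified model and not an actual decoder. The function name decode_and_output is hypothetical, no prediction is performed, and the sketch assumes that a decoded I or P picture is held until the next I or P picture has been decoded, while each B picture is output as soon as it is decoded.

```python
import re


def decode_and_output(coded):
    """Return the output (display) order for pictures given in stream order."""
    output = []
    held = None                     # decoded I/P picture awaiting output
    for pic in coded:
        if pic[0] == "B":
            output.append(pic)      # B pictures are output in decoding order
        else:
            if held is not None:
                output.append(held) # earlier I/P picture is now displayable
            held = pic              # hold the newly decoded I/P picture
    # Simplification: flush the last held picture at the end of the input.
    # In a continuous stream it would be output when the next GOP arrives.
    if held is not None:
        output.append(held)
    return output


coded = re.findall(r"[IPB]\d+", "I2B0B1P5B3B4P8B6B7P11B9B10P14B12B13")
shown = "".join(decode_and_output(coded))
# shown == "B0B1I2B3B4P5B6B7P8B9B10P11B12B13P14"
```

Applied to the stream order of FIG. 1B, the sketch reproduces the display order of FIG. 1A.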