The present invention relates to a decoding apparatus and a decoding method for use in a process for reproducing a moving picture stored in a storage device.
In recent years, as the information society has advanced, demand for sending moving pictures to other people regardless of time and place has kept increasing. In response to this demand, it has become possible to record and reproduce moving pictures with a recording apparatus or to transmit them over long distances via a communication network. Digital technology is employed to transmit and store this information. A coding method using digital technology is also adopted in broadcasting.
For recording a moving picture or an audio signal in a digital format (digital signal), a digital recording medium of a large capacity is used. Available digital recording media include a video CD (Compact Disc), which has digital moving pictures recorded on a CD, and a DVD, which contains higher-quality and longer digital moving pictures than those recorded on the video CD.
However, these digital recording media do not have storage capacities sufficient to record the moving picture for a long time period. It is therefore essential that a technique for coding the digital signal efficiently (compressing data) be employed in order to transmit and record the moving picture or the audio signal efficiently.
A technique for coding the moving picture or the audio signal has been developed. Currently, methods according to an international standard relating to “Information Technology - Coding of moving pictures and associated audio for digital storage media at up to about 1.5 Mbits/s (ISO/IEC 11172-2)” are used. The international standard is termed “MPEG” (Moving Picture Experts Group).
A description will be given of a method for coding a digital moving picture and a bitstream according to MPEG.
FIG. 6(a) shows the digital moving picture according to MPEG. A video frame group 500 called a “sequence”, comprising a series of video frames 700, is coded. The sequence is commonly divided into a series of video frame groups 600, each of which is called a group of pictures (GOP) of about 0.5 second.
FIG. 6(b) shows an example of the GOP. As shown in FIG. 6(b), the GOP is composed of I pictures, P pictures, and B pictures. The I picture is obtained by independently coding data corresponding to a video frame (the data is coded entirely by itself) and is called an “intra-picture”. The P picture is predicted from a temporally previous (forward) frame (I picture or P picture) and is called a “forward predictive picture”. The B picture is predicted from temporally previous and subsequent (forward and backward) frames (I or P pictures), that is, by interpolation between previous and subsequent I or P pictures, and is called a “bidirectionally predictive picture”.
FIG. 7(a) shows a structure of each picture. Each picture comprises a series of band-shaped regions on a frame which are called “slices” (one or more slices). Each slice comprises one or more “macroblocks”, each composed of (16×16) pixels.
FIG. 7(b) shows an example of a macroblock 800. The macroblock 800 comprises a plurality of image blocks, each composed of (8×8) pixels as shown in FIG. 7(c). The macroblock shown in FIG. 7(b) comprises 4 blocks corresponding to a luminance signal Y and 2 blocks corresponding to chrominance signals Cb and Cr. The chrominance signals Cb and Cr each comprise pixels sampled from an original image at half the resolution at which pixels are sampled to form the luminance signal Y.
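The block structure of a macroblock can be illustrated with a minimal Python sketch. It assumes that the chrominance planes have already been subsampled to 8×8 (half the luminance resolution in each direction, which is consistent with the block counts given above); the function name split_macroblock and the order in which the blocks are returned are chosen here for illustration only.

```python
def split_macroblock(y, cb, cr):
    """Split one macroblock into its six 8x8 blocks.

    y  : 16x16 list of lists of luminance samples
    cb : 8x8 list of lists of Cb samples (assumed already subsampled)
    cr : 8x8 list of lists of Cr samples (assumed already subsampled)
    Returns [Y top-left, Y top-right, Y bottom-left, Y bottom-right, Cb, Cr].
    """
    def block(src, top, left):
        # Cut an 8x8 window out of a larger plane.
        return [row[left:left + 8] for row in src[top:top + 8]]

    y_blocks = [block(y, top, left) for top in (0, 8) for left in (0, 8)]
    return y_blocks + [cb, cr]


# Example: a luminance plane whose sample value encodes its position.
y_plane = [[r * 16 + c for c in range(16)] for r in range(16)]
cb_plane = [[1] * 8 for _ in range(8)]
cr_plane = [[2] * 8 for _ in range(8)]
blocks = split_macroblock(y_plane, cb_plane, cr_plane)
print(len(blocks))  # 4 luminance blocks + 2 chrominance blocks
```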
In the above hierarchical (layered) structure, the sequences, GOPs, pictures, and slices, as upper layers, are each provided with a header containing hierarchy information. Each header comprises a code sequence called a start code, comprising “0s” of 23 bits or more followed by a “1” of 1 bit, which is uniquely identifiable in a bitstream; coding information about each layer; extension information indicating extension from MPEG to MPEG2 (“Information Technology - Generic coding of moving pictures and associated audio for digital information” (ISO/IEC 13818-2)); and the like.
FIG. 8 shows an example of a structure of the bitstream. As shown in FIG. 8, the macroblock (lower) layer contains various types of image coding information. Arranged from the head are a macroblock address increment indicating the distance from the previously (most recently) coded macroblock with respect to a two-dimensional position in a picture, a macroblock type indicating coding mode information about the macroblock to be coded, a quantiser scale indicating a quantisation scale, motion vectors for use in motion compensation, a coded block pattern indicating which coded block data is present in the bitstream, coded DCT (Discrete Cosine Transform) coefficient data, and the like.
Each of these data is represented as a variable length code. A shorter code is assigned to data that appears more frequently. Thereby, these data, which occupy most of the bitstream, are efficiently coded.
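Because the start code is uniquely identifiable, a decoder can locate headers by a simple search. The following Python sketch assumes a byte-aligned stream, in which 23 zero bits followed by a one bit appear as the three bytes 00 00 01. The byte values used to name the layers (0x00, 0x01 to 0xAF, 0xB3, 0xB8) are taken from the MPEG video specification rather than from the description above, and the function names are hypothetical.

```python
START_CODE_PREFIX = b"\x00\x00\x01"  # 23 zero bits followed by a single one bit


def find_start_codes(data: bytes):
    """Return (offset, code_byte) for every byte-aligned start code in data."""
    found = []
    pos = data.find(START_CODE_PREFIX)
    while pos != -1 and pos + 3 < len(data):
        found.append((pos, data[pos + 3]))
        # Continue searching after the prefix just found.
        pos = data.find(START_CODE_PREFIX, pos + 3)
    return found


def describe(code_byte: int) -> str:
    """Map the byte after the prefix to a layer name (values per the MPEG video spec)."""
    if code_byte == 0x00:
        return "picture"
    if 0x01 <= code_byte <= 0xAF:
        return "slice"
    if code_byte == 0xB3:
        return "sequence header"
    if code_byte == 0xB8:
        return "group of pictures"
    return "other"


sample = b"\xff\x00\x00\x01\xb3\x12\x00\x00\x00\x01\x00\x55"
for offset, code in find_start_codes(sample):
    print(offset, describe(code))
```

Extra zero bytes in front of the prefix, as in the second start code of the sample, do not disturb the search, which matches the "23 bits or more" wording.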
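Variable length decoding can be sketched in Python as follows. The code table TOY_TABLE is a hypothetical prefix code invented for illustration and is not an MPEG table; it only shows that frequent symbols receive short codes and that a bit pattern matching no entry is recognised as undefined.

```python
# Hypothetical prefix code: "a" is assumed to be the most frequent symbol.
TOY_TABLE = {"1": "a", "01": "b", "001": "c", "0001": "d"}


def decode_vlc(bits: str, table: dict) -> list:
    """Decode a string of '0'/'1' characters with a prefix-code table."""
    symbols = []
    code = ""
    max_len = max(len(c) for c in table)
    for bit in bits:
        code += bit
        if code in table:
            symbols.append(table[code])
            code = ""
        elif len(code) >= max_len:
            # No table entry can match any more: the data is undefined.
            raise ValueError(f"undefined code {code!r}")
    if code:
        raise ValueError("bitstream ends in the middle of a code")
    return symbols


print(decode_vlc("1011001", TOY_TABLE))  # codes 1, 01, 1, 001
```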
To be specific, the bitstream comprising the sequence (lower) layer is, as shown in FIG. 9, divided into units of fixed length and stored in the payloads of packets. To each of these payloads is added a packet header composed of fields such as a packet start code, a stream identifier (ID), a packet length, a PTS (Presentation Time Stamp), a DTS (Decoding Time Stamp), and the like. These packets are multiplexed to create the bitstream.
The data so coded is stored in the digital recording medium such as the video CD or the DVD and then processed by a decoding apparatus, to reproduce a moving picture.
It is essential that the decoding apparatus using the digital storage medium have a capability of reproducing moving pictures in the recorded order and a capability of performing trick play, including fast forward playback and fast rewind playback. Hereinafter, a description will be given of a method for playing back the bitstream by the fast forward playback or the fast rewind playback.
In a normal playback process, all the pictures included in the bitstream are decoded and displayed, while in the fast forward playback process, an image is displayed by either of the following methods. One method is to transfer the bitstream recorded in the digital storage medium to the decoding apparatus, which decodes only the I pictures to be displayed. The other method is to selectively transfer packets containing I picture information from the digital storage medium to the decoding apparatus, which decodes the I pictures to be displayed.
In actuality, the former method has drawbacks, including the decoding apparatus's insufficient ability to analyze the bitstream and the complicated selection of the I pictures, and therefore the latter method is commonly used for the fast forward playback process. For instance, when the decoding apparatus reproduces the pictures according to the former method, assuming that the speed of the fast forward playback process is 100 times that of the normal playback process, the decoding apparatus must analyze the bitstream 100 times as fast as it does in the normal playback process. A general decoding apparatus does not satisfy such performance requirements.
Subsequently, a description will be given of the latter decoding method with reference to FIGS. 10(a)-10(c). FIG. 10(a) shows a bitstream on the digital storage medium. FIG. 10(b) shows parts of the bitstream 300, 310, and 320, each comprising packets containing I picture information. FIG. 10(c) is a bitstream (elementary stream) comprising data contained in payloads of the parts of the bitstream 300, 310, 320, . . . .
In the fast forward playback process, the entire bitstream is not supplied to the decoding apparatus; instead, the parts of the bitstream 300, 310, 320, . . . , are sequentially supplied thereto, and the I pictures contained therein are sequentially reproduced.
In the fast rewind playback process, the parts of the bitstream to be supplied to the decoding apparatus are transferred in reverse temporal order. Specifically, the parts 320, 310, 300, . . . shown in FIG. 10(b) are supplied to the decoding apparatus to be decoded in this order.
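The way the selected parts are concatenated for trick play can be sketched in Python as follows. The function name build_trick_play_stream, the part labels, and the payload bytes are hypothetical; the sketch only shows the supply order and where the connection points between discontinuous parts fall.

```python
def build_trick_play_stream(parts, rewind=False):
    """Concatenate selected bitstream parts for fast forward or fast rewind.

    parts  : list of (label, payload_bytes) in recorded (temporal) order
    rewind : if True, supply the parts in reverse temporal order
    Returns (stream, connection_points), where connection_points are the
    byte offsets at which one part is joined to the next.
    """
    ordered = list(reversed(parts)) if rewind else list(parts)
    stream = bytearray()
    connection_points = []
    for index, (_label, payload) in enumerate(ordered):
        if index > 0:
            connection_points.append(len(stream))
        stream += payload
    return bytes(stream), connection_points


# Placeholder payloads standing in for the parts 300, 310, and 320.
parts = [("300", b"AAAA"), ("310", b"BB"), ("320", b"CCC")]
print(build_trick_play_stream(parts))               # fast forward order
print(build_trick_play_stream(parts, rewind=True))  # fast rewind order
```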
In this case, the packet containing the first I picture included in the GOP shown in FIG. 10(a) can often be specified according to management information recorded on a disc, and hence the parts of the bitstream are easily and selectively supplied to the decoding apparatus as intended.
In the fast forward playback process or the fast rewind playback process performed by the above decoding apparatus, discontinuous parts of the bitstream are supplied to the decoding apparatus, and a part containing incomplete picture data is connected to the following part. As a consequence, data different from the original data is decoded.
To be specific, at the connected portions a code is parsed incorrectly, which causes a degraded image different from the original image to be reproduced. An error at a connection point is detected because data in the following portion is parsed differently from what is expected, and it is thereby recognized that undefined data has appeared. By the time the error is detected, code parsing has already been performed incorrectly on the following part of the bitstream connected at this point, so an image different from the original image is reproduced. Although that part of the bitstream has been selected, its original image is not reproduced and displayed.
One specific example of this will be described.
Turning to FIG. 10(c) again, the bitstream (elementary stream) supplied to the decoding apparatus in the fast forward playback process is illustrated. Picture data contained in the part of the bitstream placed just before a connection point “A” is incomplete. Just after the connection point “A”, an I picture header for identifying the following I picture is placed.
FIG. 11 shows a bitstream 80 just before and after the connection point A. At the connection point A, DCT coefficient data, i.e., data lower than a macroblock N indicating an I picture to be reproduced and displayed, is connected to a picture header for identifying a subsequent I picture.
FIG. 12 shows an example of a code sequence (variable length codes) representing DCT coefficients according to MPEG. As shown in FIG. 12, data lower than the macroblock layer such as the DCT coefficients is represented as variable length codes. For an input bitstream, the variable length codes are decoded and then pictures are decoded.
In the fast forward playback process, packets containing information about required I pictures are selected, and then parts of the bitstream 300, 310, 320, . . . , are sequentially connected to create the bitstream shown in FIG. 10(c). In the part of the bitstream 300 placed just before the connection point A, the DCT coefficient data is incomplete. In this case, although the original DCT coefficient data contains a code sequence “0000 1001 (run 2, level -2)”, a code sequence of the picture header follows the middle of the original DCT coefficient data, and thereby “0000 1000 (run 2, level 2)” is parsed as shown by a code sequence 83. As a consequence, an image different from the original image is decoded, causing a serious degradation in the quality of a reproduced image.
The following part 310 starts with a start code of the picture header, the start code comprising “0s” of 23 bits or more followed by a “1” of 1 bit. However, as a result of incorrect parsing of the DCT coefficient data as shown by the code sequence 83, the following start code is parsed incorrectly, and it is decided that it contains the remaining portion of the DCT coefficient data of a block n+1 as shown by a code sequence 85. If the code sequence 85 is not a variable length code defined according to MPEG, an error is detected in the decoding process. Even if the error is not detected just after the connection point, an error occurs and is detected if the appearance or the order of the headers does not conform to a rule. When the error is detected, it is decided that the data being decoded is ineffective, the header of the following I picture is searched for, and a code sequence at a point (a point B) shown in FIG. 10(c) is decoded. Although the I picture contained in the part 310 has been selected and supplied to the decoding apparatus, an image corresponding to the I picture is not reproduced and displayed.
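The misparse at the connection point can be reproduced with a short Python sketch. Only the two code sequences quoted above are placed in the table. The cut position (one bit before the end of the code) and the function name splice_and_parse are assumptions made for illustration; the actual cut position depends on where the part of the bitstream ends.

```python
# The two DCT coefficient codes quoted in the text: code -> (run, level).
DCT_CODES = {
    "00001000": (2, 2),
    "00001001": (2, -2),
}
START_CODE_PREFIX = "0" * 23 + "1"  # 23 zero bits followed by a one bit


def splice_and_parse(original_code: str, cut_after: int):
    """Cut a code after `cut_after` bits, append a start code, and parse 8 bits.

    Returns ((run, level), leftover_bits), where leftover_bits is what the
    decoder sees after it has consumed the misparsed code.
    """
    spliced = original_code[:cut_after] + START_CODE_PREFIX
    parsed = DCT_CODES[spliced[:8]]
    return parsed, spliced[8:]


# Assumed cut: part 300 ends one bit before the code "0000 1001" is complete.
parsed, leftover = splice_and_parse("00001001", cut_after=7)
print(parsed)                         # the decoder reads (run 2, level 2)
print(START_CODE_PREFIX in leftover)  # is the start code still intact?
```

Under this assumed cut position, the first zero bit of the start code is consumed as the last bit of the DCT code, so the bits that remain no longer contain the full run of 23 zero bits.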
It is an object of the present invention to provide a decoding apparatus and a decoding method which are capable of decoding a code sequence to obtain complete information with the use of data being decoded even if an error is detected in a process for decoding a bitstream of a layered (hierarchical) structure.
Other objects and advantages of the invention will become apparent from the detailed description that follows. The detailed description and specific embodiments described are provided only for illustration since various additions and modifications within the spirit and the scope of the invention will be apparent to those skilled in the art from the detailed description.
According to a first aspect of the present invention, a decoding apparatus comprises: decoding means which receives a first code sequence of a hierarchical structure as an input, decodes a code sequence of a selected first layer and higher layers in the first code sequence, outputs a detection signal when an error is detected while the code sequence is being decoded, and detects a second code sequence indicating the start of the first layer and higher layers in a set code sequence parsing position of the first code sequence; and means for setting a code sequence parsing position which receives the detection signal and sets the parsing position of the code sequence in such a way that code sequence parsing performed by the decoding means is returned from a point where code sequence parsing is being performed to a point where code sequence parsing has been performed in the first code sequence. Therefore, serious degradation in the quality of a reproduced image caused by an error occurring in a selected code sequence, can be avoided.
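The effect of returning the parsing position can be illustrated with a simplified Python model. This is only a sketch under stated assumptions and not the claimed implementation: the bitstream is modelled as a string of '0' and '1' characters, and the function name resync, the parameter rewind_bits, the error position, and the sample bit patterns are all hypothetical.

```python
START_CODE_PREFIX = "0" * 23 + "1"  # 23 zero bits followed by a one bit


def resync(bits: str, error_pos: int, rewind_bits: int = 0) -> int:
    """Search for the next start code after an error has been detected.

    With rewind_bits == 0 the search starts at the point where parsing is
    being performed. With rewind_bits > 0 the parsing position is first
    returned to a point that has already been parsed.
    Returns the bit offset of the start code found, or -1 if there is none.
    """
    start = max(0, error_pos - rewind_bits)
    return bits.find(START_CODE_PREFIX, start)


# Model of the stream around connection point A (all bit patterns are made up).
part_300_tail = "0000100"                          # incomplete DCT code
picture_310 = START_CODE_PREFIX + "11011011" * 4   # header and data of part 310
picture_320 = START_CODE_PREFIX + "10110110" * 4   # header and data of part 320
bits = part_300_tail + picture_310 + picture_320

point_a = len(part_300_tail)                # start code of part 310
point_b = point_a + len(picture_310)        # start code of part 320
error_pos = 23                              # assumed point of error detection

print(resync(bits, error_pos) == point_b)                  # forward-only search
print(resync(bits, error_pos, rewind_bits=32) == point_a)  # search after rewinding
```

In this model the forward-only search skips the I picture of part 310 and resumes at point B, whereas the search that first returns the parsing position finds the start code placed just after connection point A.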
According to a second aspect of the present invention, in the decoding apparatus of the first aspect, the decoding means nullifies decoded data of the first layer being decoded and lower layers, after detecting the second code sequence. Therefore, serious degradation in the quality of a reproduced image caused by an error occurring in a selected code sequence, can be avoided. Besides, pictures selectively supplied to the decoding apparatus can be reproduced with reliability.
According to a third aspect of the present invention, in the decoding apparatus of the first aspect, the decoding means, after detecting the second code sequence, performs a predetermined process for data which is being decoded and will be obtained by decoding a code sequence of a layer lower than the first layer which will appear in the first code sequence, based on decoded data, in order to complete data of the layer lower than the first layer. Therefore, serious degradation in the quality of a reproduced image caused by an error occurring in a selected code sequence, can be avoided. Besides, an image in which an error has occurred can be reproduced with reliability.
According to a fourth aspect of the present invention, a decoding method comprises: a decoding step which receives a first code sequence of a hierarchical structure as an input, and decodes a code sequence of a selected first layer and higher layers in the first code sequence, wherein a parsing position of a code sequence is set in such a way that code sequence parsing is returned from a point where code sequence parsing is being performed to a point where code sequence parsing has been performed in the first code sequence, and a second code sequence indicating the start of the first layer and higher layers is detected in the set parsing position of the first code sequence, when the error is detected while the code sequence is being decoded. Therefore, serious degradation in the quality of a reproduced image caused by an error occurring in a selected code sequence, can be avoided.
According to a fifth aspect of the present invention, in the decoding method of the fourth aspect, the decoding step nullifies decoded data of the first layer being decoded and lower layers, after detecting the second code sequence. Therefore, serious degradation in the quality of a reproduced image caused by an error occurring in a selected code sequence, can be avoided. Besides, pictures selectively supplied to the decoding apparatus can be reproduced with reliability.
According to a sixth aspect of the present invention, in the decoding method of the fourth aspect, the decoding step, after detecting the second code sequence, performs a predetermined process for data which is being decoded and will be obtained by decoding a code sequence of a layer lower than the first layer which will appear in the first code sequence, based on decoded data, in order to complete data of the layer lower than the first layer. Therefore, serious degradation in the quality of a reproduced image caused by an error occurring in a selected code sequence, can be avoided. Besides, an image in which an error has occurred can be reproduced with reliability.