(1) Field of the Invention
The present invention relates to a picture decoding method for properly decoding coded data containing an error.
(2) Description of the Related Art
Recently, with the advent of the age of multimedia, which handles audio, pictures and other pixel values in an integrated manner, existing information media through which information is carried to people, such as newspapers, journals, TVs, radios and telephones, have come under the scope of multimedia. Generally speaking, multimedia refers to a representation in which not only text but also graphics, audio and, in particular, pictures are simultaneously linked together. However, the information for the above existing information media must first be digitized before it can be handled as multimedia information.
The estimated storage capacity required to store the information carried by each of the above information media, when it is converted to digital data, is only 1 or 2 bytes per character for text, but 64 Kbits for one second of telephone-quality audio and 100 Mbits for one second of video at current television receiver quality. It is therefore not practical to handle these massive amounts of information in digital form on the above information media. For example, video telephony service is available over Integrated Services Digital Network (ISDN) lines with a transmission speed of 64 Kbit/s to 1.5 Mbit/s, but television camera grade video cannot be sent as it is over the ISDN lines.
Data compression therefore becomes essential. Video telephony service, for example, is implemented using video compression techniques internationally standardized in International Telecommunication Union, Telecommunication Standardization Sector (ITU-T) Recommendations H.261 and H.263. Using the data compression techniques defined in MPEG-1, video information can be recorded together with audio information on a conventional audio compact disc (CD).
The Moving Picture Experts Group (MPEG) standards are international standards for compressing moving picture (video) signals. MPEG-1 is a standard that enables compression of a video signal to 1.5 Mbit/s, that is, compression of the information in a television signal to approximately one hundredth (1:100) of its original size. MPEG-1 targets medium picture quality because the transmission speed for MPEG-1 video is limited to approximately 1.5 Mbit/s. MPEG-2, which was standardized to meet the demand for even higher picture quality, therefore enables compression of a moving picture signal to 2 Mbit/s to 15 Mbit/s. Furthermore, MPEG-4, with an even higher compression rate, has also been standardized by the working group (ISO/IEC JTC1/SC29/WG11) that advanced the standardization of MPEG-1 and MPEG-2. MPEG-4 not only enables coding, decoding and operations on a per-object basis, but also introduces new capabilities required in the multimedia age. At first, MPEG-4 was developed for the purpose of standardizing a coding method for a low bit rate. However, it has been extended to a more versatile coding method that includes coding of interlaced pictures. MPEG-4 AVC and ITU-T H.264 have been standardized as a next generation coding method with a higher compression rate through a collaboration between ISO/IEC and ITU-T.
In coding of a moving picture, information is usually compressed by removing spatial and temporal redundancies. Inter-picture prediction coding, which aims at reducing the temporal redundancy, estimates motion and generates a predicted picture on a block-by-block basis with reference to forward and backward pictures, and then encodes a differential value between the obtained predicted picture and a current picture to be coded. Here, a “picture” is a term that represents a single picture: it represents a “frame” when used for a progressive picture, whereas it represents a “frame” or a “field” when used for an interlaced picture. The interlaced picture here is a picture in which a single frame consists of two fields having different times. For coding and decoding an interlaced picture, a single frame can be handled as a frame, as two fields, or by switching between a frame structure and a field structure on a block-by-block basis within the frame.
A picture on which intra-picture prediction coding is performed without referring to any picture is called an I-picture. A picture on which inter-picture prediction coding is performed by referring to only one picture is called a P-picture. A picture on which inter-picture prediction coding is performed by referring to two pictures is called a B-picture. A B-picture can refer to two pictures in two directions. These two pictures can be selected from an arbitrary combination of forward and/or backward pictures in display order. Reference pictures can be specified for each block which is a basic unit for coding and decoding, but they are distinguished as the first reference picture for a reference picture that is described first in coded data and as the second reference picture for a reference picture that is described later. However, in order to use pictures as reference pictures for coding or decoding these I, P and B pictures, the reference pictures need to be already coded or decoded.
Motion compensation inter-picture prediction coding is employed for coding P-pictures and B-pictures. It is a coding method that applies motion compensation to inter-picture prediction coding. Motion compensation is not a scheme that simply generates a predicted picture using the pixel values of a reference picture. It is a scheme for improving prediction accuracy and reducing the amount of data by estimating a motion amount (to be referred to as a “motion vector” hereinafter) of each part within a picture and making a prediction in consideration of the motion vector. For example, the amount of data is reduced by estimating a motion vector for a current picture to be coded and coding a prediction residual between the current picture and a predicted picture obtained by shifting a reference picture by the amount equivalent to the motion vector. In the case of using this scheme, since the motion vector information is needed for decoding, the motion vector is also coded and then recorded or transmitted (see, for example, Japanese Laid-Open Patent Application No. 05-153574 and Japanese Laid-Open Patent Application No. 06-189284).
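The motion compensation scheme described above can be illustrated with a minimal sketch. It is not taken from the cited applications; the function names, the list-of-lists picture representation, and the integer-pixel motion vector are assumptions made only for illustration, and the motion vector is assumed to keep the displaced block inside the reference picture.

```python
# Illustrative sketch only: names, block layout and integer-pixel motion
# vectors are assumptions, not part of the described conventional method.

def predict_block(reference, top, left, size, mv):
    """Return the size x size block of `reference` displaced by mv = (dy, dx).

    Assumes the displaced block lies entirely inside the reference picture.
    """
    dy, dx = mv
    return [
        [reference[top + dy + r][left + dx + c] for c in range(size)]
        for r in range(size)
    ]


def residual_block(current, predicted):
    """Encoder side: prediction residual = current block - predicted block."""
    return [
        [cur - pred for cur, pred in zip(cur_row, pred_row)]
        for cur_row, pred_row in zip(current, predicted)
    ]


def reconstruct_block(predicted, residual):
    """Decoder side: decoded block = predicted block + prediction residual."""
    return [
        [pred + res for pred, res in zip(pred_row, res_row)]
        for pred_row, res_row in zip(predicted, residual)
    ]


# A 6 x 6 reference picture whose pixel at (row, col) is row * 10 + col.
reference = [[r * 10 + c for c in range(6)] for r in range(6)]
current = [[24, 24], [33, 36]]           # 2 x 2 block located at (1, 1)
mv = (1, 2)                              # hypothetical estimated motion vector

predicted = predict_block(reference, 1, 1, 2, mv)
residual = residual_block(current, predicted)
decoded = reconstruct_block(predicted, residual)
```

In this sketch only the motion vector and the small residual would need to be coded, and the decoder recovers the current block by adding the residual to the block it predicts from the same reference picture.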
FIG. 1 is a block diagram showing a structure of a picture decoding apparatus using a conventional picture decoding method.
Coded data Str is decoded by a variable length decoding unit VLD, and error information err, motion information mv, a coding mode mode and a prediction error coeff are outputted. The error information err is information that indicates whether or not each block in the coded data Str contains an error. The motion information mv is information necessary for inter-picture motion compensation and contains information that indicates a motion vector and a picture referred to for inter-picture motion compensation. The coding mode mode is information that indicates whether a block has been intra coded or inter coded. The prediction error coeff is a coded prediction error in intra-picture or inter-picture prediction and is information that indicates the size of the prediction error.
An inverse quantization unit IQ performs inverse quantization on the prediction error coeff and outputs the inversely quantized error to the inverse orthogonal transformation unit IT. The inverse orthogonal transformation unit IT performs inverse orthogonal transformation on the error and outputs the resulting data to the addition unit Add.
When the block is inter coded, a motion compensation unit MC performs motion compensation, based on the information indicated by the motion information mv, on the pixel values of a reference picture outputted from a picture memory PM, and outputs the resulting data to a switch Sel1.
When the block is intra coded, an intra-picture prediction unit IP performs intra-picture prediction of the output from an addition unit Add and outputs the resulting data to the switch Sel1.
The switch Sel1 selects the output from the intra-picture prediction unit IP if the coding mode mode indicates intra-picture coding, and it selects the output from the motion compensation unit MC if the coding mode mode indicates inter-picture coding.
The addition unit Add adds the output of the inverse orthogonal transformation unit IT and the output of the switch Sel1. The output of the addition unit Add is the decoded pixel values to which a deblocking filter has not yet been applied, and is used as reference pixel values for intra-picture prediction performed by the intra-picture prediction unit IP.
The deblocking filter DBF applies deblocking filtering on the output from the addition unit Add by referring to the motion information mv, the coding mode mode and the prediction error coeff, so as to remove blocking distortion and generate a decoded picture Vout.
In the case where the error information err indicates that there is no error in a block, the decoded picture Vout is stored as it is into the picture memory PM. In the case where the error information err indicates that there is an error in a block, the error concealment unit EC generates an error-concealed pixel block from a decoded picture stored in the picture memory PM, and this error-concealed pixel block is stored into the picture memory PM instead of the decoded pixel values of the block.
The output from the picture memory PM is used as reference pixel values for the motion compensation unit MC when the next picture is decoded.
FIG. 2 shows an example of a method for selecting a deblocking filter. There are four types of filters from a filter 0 to a filter 3. The filter 0 has the highest smoothing level, the filter 1 has the next highest smoothing level, and the filters 2 and 3 have respective smoothing levels in descending order.
More specifically, since blocking distortion is more obvious in an intra coded block than in an inter coded block, the filter 0 with the highest smoothing level is applied to the boundary between two adjacent blocks when at least one of the blocks has been intra coded. Next, since blocking distortion is more obvious in a block that contains an inter-picture prediction error than in a block that contains no inter-picture prediction error, the filter 1 with the next highest smoothing level is applied when at least one of the adjacent blocks contains an inter-picture prediction error (namely, the coded coefficients of that block are included in the coded data). Furthermore, since blocking distortion is obvious when two adjacent blocks are different from each other in their motion information mv (such as motion vectors), the filter 2 with a relatively low smoothing level is applied in that case. When the motion information mv of the two adjacent blocks is identical, the filter 3 with the lowest smoothing level is applied. Note that the use of the filter 3 includes the case where no deblocking filter is applied.
As mentioned above, the filter 0, the filter 1, the filter 2 and the filter 3 can be switched depending on the degree to which blocking distortion is obvious.
FIG. 3 is a flowchart showing how a deblocking filter is selected in the conventional picture coding method, and shows an implementation example of the method for selecting one of the deblocking filters shown in FIG. 2.
In Step 10, it is judged whether or not at least one of two consecutive blocks has been intra coded (using the information of the coding mode mode). When at least one of the blocks has been intra coded, deblocking filtering is performed using the filter 0 in Step 15, and then the process is ended. When both the blocks have been inter coded, a judgment is made in Step 11.
In Step 11, it is judged whether or not the coefficients of at least one of the blocks have been coded (using the information of the prediction error coeff). When the coefficients of at least one of the blocks have been coded, deblocking filtering is performed using the filter 1 in Step 16, and then the process is ended. When the coefficients of neither block have been coded, a judgment is made in Step 12.
In Step 12, it is judged whether or not the motion information of these blocks is different from each other (using the information of the motion information mv). When the motion information of one block is different from that of the other block, deblocking filtering is performed using the filter 2 in Step 17, and then the process is ended. When this is not the case (i.e., when the motion information of one block is identical to that of the other block), deblocking filtering is performed using the filter 3 in Step 18, and then the process is ended.
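The selection procedure of Steps 10 to 18 can be sketched as follows. This is only an illustrative sketch: the Block type, its field names, and the representation of the motion information as a motion vector plus a reference picture index are assumptions and do not appear in FIG. 3.

```python
# Illustrative sketch of the filter selection in FIG. 3. The Block type and
# its fields are hypothetical names introduced for this example only.
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Block:
    intra: bool                    # coding mode: True if the block is intra coded
    has_coeff: bool                # True if prediction error coefficients were coded
    mv: Tuple[int, int] = (0, 0)   # motion vector (part of motion information mv)
    ref: int = 0                   # index of the referenced picture (also part of mv)


def select_filter(p: Block, q: Block) -> int:
    """Return the filter number (0 = strongest smoothing, 3 = weakest)
    for the boundary between two adjacent blocks p and q."""
    # Step 10: has at least one of the blocks been intra coded?
    if p.intra or q.intra:
        return 0                   # Step 15
    # Step 11: have the coefficients of at least one block been coded?
    if p.has_coeff or q.has_coeff:
        return 1                   # Step 16
    # Step 12: does the motion information of the two blocks differ?
    if (p.mv, p.ref) != (q.mv, q.ref):
        return 2                   # Step 17
    return 3                       # Step 18


# Two inter coded blocks without coefficients but with different motion vectors.
example = select_filter(Block(False, False, (1, 0)), Block(False, False, (0, 0)))
```

The tests are ordered from the condition that calls for the strongest smoothing to the one that calls for the weakest, so the first condition that holds determines the filter, as in the flowchart.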
FIG. 4 is a diagram for explaining one example of error concealment for concealing pixel values of a block that contains an error. The pixel values of a block 171 which cannot be properly decoded due to an error in a target decoded picture are replaced with the pixel values of a block 172 which is co-located in a reference picture referred to for inter-picture prediction. Such generation of pixel values of a block that contains an error using replacement or the like, as shown in this example, is called “error concealment”.
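The replacement shown in FIG. 4 can be sketched as a simple block copy. The function name, the list-of-lists picture representation, and the square block given by its top-left position are assumptions for illustration, and the block is assumed to lie entirely inside both pictures.

```python
# Illustrative sketch of error concealment by co-located block replacement.
# The function name and picture layout are assumptions for this example.

def conceal_block(decoded, reference, top, left, size):
    """Overwrite the size x size block at (top, left) in `decoded` with the
    co-located block of `reference`. Modifies `decoded` in place and returns it.

    Assumes both pictures have the same dimensions and contain the block.
    """
    for row in range(top, top + size):
        decoded[row][left:left + size] = reference[row][left:left + size]
    return decoded


# A 4 x 4 target picture in which the 2 x 2 block at (1, 1) contains an error,
# and a 4 x 4 reference picture filled with the value 7.
target = [[0] * 4 for _ in range(4)]
ref_picture = [[7] * 4 for _ in range(4)]
conceal_block(target, ref_picture, 1, 1, 2)
```

After the call, only the pixels of the erroneous block in the target picture are replaced, which corresponds to replacing the block 171 with the block 172.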
FIG. 5 is a block diagram showing a structure of another picture decoding apparatus using a conventional picture decoding method. Since some of the processing units in the block diagram of the conventional picture decoding apparatus in FIG. 5 operate in the same manner as those in the block diagram of the conventional picture decoding apparatus in FIG. 1, the identical reference numbers are assigned to such processing units and a description thereof is not repeated here.
In FIG. 1, the variable length decoding unit VLD detects error information err from coded data Str. However, in some types of applications or device implementations, the error information err can be obtained not from the coded data Str but from a device which receives the coded data Str. In such a case, the error information err is supplied from outside, as shown in FIG. 5.
However, no conventional picture decoding method using deblocking filters discloses what kind of deblocking filter should be used for a block that contains an error. For a block in which an error has occurred, the appearance of blocking distortion should vary depending on the error. Therefore, even if an error occurs, it should be possible to reconstruct a less degraded decoded picture by switching the deblocking filter adaptively.