Recently, with the advent of the multimedia era in which audio, pictures, and other information are handled in an integrated manner, conventional information media used as communication tools, including newspapers, magazines, TVs, radios, and telephones, have come within the scope of multimedia. Generally, multimedia refers to a representation in which not only text but also graphics, audio, and particularly pictures are simultaneously associated with one another. In order to handle such conventional information media as multimedia, the information must be digitized.
When the amount of information carried by each of these information media is expressed as an amount of digital information, text requires 1 to 2 bytes per character, audio requires 64 Kbits per second (telephone quality), and moving pictures require more than 100 Mbits per second (reception quality of current television). It is therefore not practical to handle these massive amounts of multimedia information as they are in digital format, that is, without compression. For example, a video phone is now in practical use via the Integrated Services Digital Network (ISDN) with a transmission rate of 64 Kbits/s to 1.5 Mbits/s. However, images from TVs and cameras cannot be transmitted over the ISDN with their original digital information amount.
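The gap between the raw video rate and the ISDN rate can be illustrated with a short calculation. The following is a minimal sketch only: the picture size (720 x 480), frame rate (30 pictures per second), and 12 bits per pixel (8-bit samples with 4:2:0 chroma subsampling) are assumed values chosen for illustration and are not taken from the text above; the function names are likewise hypothetical.

```python
def raw_video_bit_rate(width, height, fps, bits_per_pixel):
    """Bits per second of uncompressed video."""
    return width * height * fps * bits_per_pixel


def required_compression_ratio(raw_bps, channel_bps):
    """How many times the raw stream must shrink to fit the channel."""
    return raw_bps / channel_bps


# Assumed standard-definition format (illustrative values only).
raw_bps = raw_video_bit_rate(width=720, height=480, fps=30, bits_per_pixel=12)

# ISDN rates quoted in the text: 64 Kbits/s to 1.5 Mbits/s.
for channel_bps in (64_000, 1_500_000):
    ratio = required_compression_ratio(raw_bps, channel_bps)
    print(f"raw {raw_bps / 1e6:.1f} Mbit/s -> channel "
          f"{channel_bps / 1e6:.3f} Mbit/s: ratio about {ratio:.0f}:1")
```

Under these assumptions the raw rate exceeds 100 Mbits/s, which agrees with the order of magnitude stated above.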
Therefore, information compression techniques are essential. For example, the video phone uses moving picture compression techniques compliant with the H.261 or H.263 standard recommended by ITU-T (International Telecommunication Union-Telecommunication Standardization Sector). Using information compression techniques compliant with MPEG-1, picture information can be stored together with audio information on a general music compact disc (CD).
Here, MPEG (Moving Picture Experts Group) refers to an international standard for compression of moving picture signals, standardized by the International Organization for Standardization and the International Electrotechnical Commission (ISO/IEC). MPEG-1 is a standard to compress moving picture signals down to 1.5 Mbits/s, that is, to compress information of TV signals approximately down to one hundredth. Because the target transmission rate of the MPEG-1 standard is limited to about 1.5 Mbits/s, which yields pictures of medium quality, MPEG-2 was standardized to meet the need for higher picture quality. MPEG-2 enables transmission of moving picture signals at 2 to 15 Mbits/s to achieve television broadcast quality. Furthermore, the working group (ISO/IEC JTC1/SC29/WG11) in charge of the standardization of MPEG-1 and MPEG-2 has standardized MPEG-4. MPEG-4 achieves a compression rate higher than those of MPEG-1 and MPEG-2, further enables coding and decoding operations on an object-by-object basis, and thereby achieves new functions necessary in the multimedia era. The initial aim of MPEG-4 was to standardize a low-bit-rate coding method; however, the aim has been extended to include more versatile coding of moving pictures, including interlaced pictures, at a high bit rate. Moreover, MPEG-4 AVC and ITU-T H.264 have been standardized as picture coding systems with a higher compression rate through a joint project of ISO/IEC and ITU-T.
Here, a picture signal can be considered as a sequence of pictures (also referred to as frames or fields), each of which is a group of pixels sampled at the same time. Neighboring pixels in each picture have a strong correlation with each other, and thus the correlation between the pixels within a picture is used in the compression. Furthermore, consecutive pictures have a high correlation between their pixels, and thus the pixel correlation between pictures is also used in the compression. Here, compression using both a correlation between pixels in different pictures and a correlation between pixels within a picture is referred to as inter-coding, whereas compression using the correlation between pixels within a picture without using the correlation between pixels in different pictures is referred to as intra-coding. The inter-coding, which uses the correlation between pictures, can achieve a compression rate higher than that of the intra-coding. In MPEG-1, MPEG-2, MPEG-4, MPEG-4 AVC, and H.264, each picture includes blocks (or macroblocks), each of which is a group of pixels in a two-dimensional rectangular area, and the inter-coding and the intra-coding are switched per block.
In the case of intra-coding, blocks of prediction errors are generated for each picture using the correlation between pixel blocks within a picture or the correlation between pixels within a picture. In the case of inter-coding, blocks of prediction errors are generated for each picture using the correlation between pictures. A two-dimensional orthogonal transform, such as a discrete cosine transform (DCT), is performed on these blocks. The frequency components of the blocks on which the two-dimensional orthogonal transform has been performed are quantized with a predetermined quantization step (size). The quantized values are variable-length coded and transmitted to, for example, a network.
In the quantization, a larger quantization step results in a higher compression rate with a larger picture coding distortion, and a smaller quantization step results in a lower compression rate with a smaller picture coding distortion. Here, the quantization step refers to the size of quantization, and indicates how precisely the quantization is performed.
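The trade-off between the quantization step and the coding distortion can be sketched as follows. This is a simplified illustration, not the quantizer of any particular standard: it assumes an orthonormal two-dimensional DCT, a single uniform step applied to every frequency component, and rounding to the nearest level. The function names and the sample 8 x 8 block are hypothetical, and the variable-length coding stage is omitted.

```python
import math


def dct_matrix(n):
    """Orthonormal DCT-II basis as an n x n matrix (row k = k-th frequency)."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(m):
    return [list(col) for col in zip(*m)]


def dct_2d(block):
    """Forward 2-D transform of a square block: C * X * C^T."""
    c = dct_matrix(len(block))
    return matmul(matmul(c, block), transpose(c))


def idct_2d(coeffs):
    """Inverse 2-D transform: C^T * Y * C."""
    c = dct_matrix(len(coeffs))
    return matmul(matmul(transpose(c), coeffs), c)


def code_block(block, step):
    """Quantize with a uniform step; return (nonzero level count, MSE)."""
    levels = [[round(v / step) for v in row] for row in dct_2d(block)]
    recon = idct_2d([[lv * step for lv in row] for row in levels])
    nonzero = sum(1 for row in levels for lv in row if lv != 0)
    n = len(block)
    mse = sum((block[y][x] - recon[y][x]) ** 2
              for y in range(n) for x in range(n)) / (n * n)
    return nonzero, mse


# Hypothetical 8 x 8 block of prediction errors.
sample = [[(3 * x + 5 * y * y) % 50 for x in range(8)] for y in range(8)]
for step in (1, 8, 64):
    nonzero, mse = code_block(sample, step)
    print(f"step={step:2d}  nonzero levels={nonzero:2d}  MSE={mse:.3f}")
```

A larger step leaves fewer nonzero levels to be variable-length coded (higher compression) at the cost of a larger reconstruction error, which is the behavior described above.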
In the case of coding moving pictures, quantization steps are generally calculated such that the bit rate, which indicates the data amount per second, is constant and the quantization steps within a picture are approximately the same. The bit rate is kept constant because the bit rate at which coded streams are transmitted over the network is constant. Approximately constant quantization steps are used within a picture because if the degree of coding distortion varies depending on the position within a picture (horizontal position, vertical position), that is, if the quantization steps within a picture are not uniform, the coding distortion is likely to be noticeable.
As described, in general, in the case of coding moving pictures, a control is performed such that a uniform quantization step is used within a picture. Such a control has an advantage that the coding distortion becomes uniform and the coding distortion is not likely to be noticeable at a certain position.
In networks, with widespread high-speed network environments using Asymmetric Digital Subscriber Lines (ADSLs) and optical fibers, general households can now transmit and receive information at a bit rate over several Mbits/s. Furthermore, it is likely that information can be transmitted and received at several tens of Mbits/s in the next few years. Thus, it is expected that with the picture coding technique, not only companies using dedicated lines but also general households will bring in video phone and teleconferencing systems of television broadcast quality, HDTV broadcast quality, and super HDTV quality in the future.
When coded picture data that is a stream is transmitted through a network, part of the stream may be lost due to, for example, network congestion. When part of the stream is lost, a receiver cannot properly decode the picture corresponding to the lost part of the stream, which causes degradation in picture quality. In addition, in the subsequent pictures, decoding of streams inter-coded with reference to the picture corresponding to the lost part of the stream also involves degradation in picture quality. To prevent this, a coding unit that is a group of blocks is defined as a slice. A slice is the minimum unit of independent coding and decoding. Decoding can be performed per slice even when part of a stream is lost. For example, as shown in FIG. 22, there is a conventional technique where each picture is divided into slices in five rows. One of the slices is an I-slice, and the position of the I-slice within a picture cyclically shifts through temporally consecutive pictures. Here, the I-slice refers to an intra-coded slice. Even when part of a stream is lost due to network congestion or the like, the technique allows proper decoding at the receiver without being influenced by the loss of that part of the stream. In other words, the cyclic shift of the I-slice prevents errors caused by network congestion or the like from influencing the decoding of subsequent pictures in inter-coding.
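The cyclic shift of the I-slice can be sketched as follows. This is an illustrative model under stated assumptions, not a reproduction of FIG. 22: the I-slice is assumed to move down by one slice row per picture starting from the top row, the other slices are labeled "P" merely to mark them as inter-coded, and inter prediction of a slice is assumed to refer only to the slice at the same position in earlier pictures. All function names are hypothetical.

```python
def slice_types(picture_index, num_slices=5):
    """Slice types of one picture, top to bottom ("I" = intra-coded)."""
    i_position = picture_index % num_slices  # assumed shift direction
    return ["I" if s == i_position else "P" for s in range(num_slices)]


def pictures_until_refresh(lost_picture, lost_slice, num_slices=5):
    """Number of pictures after the loss until the slice is intra-coded again.

    Assumes an error stays within its own slice position, so the next
    I-slice at that position stops the propagation.
    """
    return (lost_slice - lost_picture - 1) % num_slices + 1


# Print the pattern for ten consecutive pictures.
for t in range(10):
    print(t, " ".join(slice_types(t)))

# Example: slice row 2 is lost in picture 3.
wait = pictures_until_refresh(lost_picture=3, lost_slice=2)
print("refreshed at picture", 3 + wait)
```

With five slices, any slice position is intra-coded again within at most five pictures after a loss, which bounds how long an error can persist under these assumptions.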
FIG. 23 is a diagram illustrating a relationship between slices and blocks in a slice division scheme in accordance with MPEG-2. A picture (one frame) shown in FIG. 23 includes a plurality of blocks. The blocks in the same row make up a slice.
FIG. 24 is a diagram illustrating a coding order of blocks in a picture. The blocks in the picture shown in FIG. 23 are coded in the order indicated in FIG. 24, that is, the order from left to right within each slice and from a top slice to a bottom slice within a picture. The blocks in each picture are coded in such a coding order so that a stream is generated.
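The coding order described above can be expressed as a short sketch. It assumes, as in the scheme of FIG. 23, that each row of blocks forms one slice; the picture dimensions in blocks and the function name are hypothetical and chosen only for illustration.

```python
def coding_order(blocks_wide, blocks_high):
    """Return (slice_index, block_column) pairs in coding order.

    Each block row is one slice, so the slice index equals the row index.
    Blocks are visited left to right within a slice, and slices are
    visited from the top of the picture to the bottom.
    """
    order = []
    for slice_index in range(blocks_high):        # top slice to bottom slice
        for block_column in range(blocks_wide):   # left to right in a slice
            order.append((slice_index, block_column))
    return order


# Example: a small picture that is 4 blocks wide and 3 blocks high.
for position, (slice_index, block_column) in enumerate(coding_order(4, 3)):
    print(f"coded #{position}: slice {slice_index}, block column {block_column}")
```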