1. Field of the Invention
Embodiments of the present invention relate to Fine Granularity Scalability (FGS) video coding/encoding, and more particularly, to a video encoder/decoder and a video encoding/decoding method and medium, and a video packet structure.
2. Description of the Related Art
Scalable coding methods are in great demand to perform general coding operations for still or moving pictures.
With the advent of mobile communications services and the wireless Internet, people can communicate using picture information at any time and in any place. People also typically desire to obtain remote picture information using information appliances connected to various types of computers, such as laptop computers, personal digital assistants (PDAs), and the like. In addition, various new types of information appliances will appear in the future. However, these future Internet appliances will naturally provide different decoding capabilities or transmission environments because their terminals will have different characteristics or adaptation environments.
To solve this problem, MPEG-4 (Moving Picture Experts Group-4) standards provide techniques for providing pictures of various qualities depending on the circumstances or performances of terminals which receive the resultant encoded pictures. For example, if a receiver terminal has excellent performance and a good transmission line, the terminal may be able to receive and display a high-quality moving picture. On the other hand, if the receiver terminal has poor performance and a bad communications line, the terminal will not be able to receive a high-quality picture. To cover these two cases, a video codec, such as MPEG-2, MPEG-4, H.263, or the like, is designed to perform scalable coding.
Scalable picture coding denotes production of a scalable bitstream by an encoder so that a receiver can receive pictures of various qualities from the encoder. In other words, if a transmitting bitstream is scalable, various types of receivers can be used. For example, low performance receivers can receive and display an average-quality picture bitstream encoded in a basic layer. High performance receivers can receive and display a basic layer picture bitstream and a high-quality picture bitstream encoded in an enhancement layer.
Scalable coding methods are roughly classified into spatial scalable coding methods and temporal scalable coding methods. The spatial scalable coding method is used to improve the spatial resolution of a picture, step by step. The temporal scalable coding method is used to gradually increase the number of pictures displayed on a time axis per unit time. To perform each of the spatial and temporal scalable coding methods, an MPEG-4 encoder uses at least one enhancement layer to transmit a bitstream to a receiver. When a moving picture is encoded using an enhancement layer, a basic layer codes a picture with a spatially and temporally low resolution and transmits the low-quality encoded information to a receiver. The enhancement layer is then coded and transmits the additional information needed to obtain the picture with a greater resolution.
In scalable coding methods, a picture is coded using two or three layers, so only two or three bit rates can be obtained. However, since some transmission lines, such as the wired/wireless Internet, are not stable, their bandwidth varies arbitrarily. Thus, MPEG-4 is standardizing a fine granularity scalability (FGS) coding method so that an encoder can flexibly adapt to a change in the bit rate provided by a transmission line.
In the FGS coding method, to achieve efficient picture restoration even when a receiver receives only a part of a video bitstream from a transmitter, a video bitstream with its quality being enhanced by an enhancement layer is transmitted bit plane by bit plane. In other words, the current FGS coding method is similar to existing scalable coding methods in that when a transmitter transmits an enhancement layer bitstream to a receiver, the only difference between the original picture and a picture transmitted by a basic layer is the quality of the transmitted picture. However, in the FGS coding method, picture information to be transmitted from the enhancement layer is divided into bit planes, and the bit plane holding the most significant bits (MSBs) is transmitted to the receiver first, followed by the bit plane holding the next most significant bits, and so on. Accordingly, even when the receiver cannot receive all bits necessary for picture restoration due to a change in the bandwidth of a transmission line, it can restore the transmitted picture to some degree using only the received bits. However, the MPEG-4 FGS coding method excessively increases the total number of bits due to bit-plane coding.
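The bit-plane ordering described above can be illustrated with a minimal Python sketch. It is not the MPEG-4 FGS syntax: the function names, the four-plane depth, and the sample magnitudes are assumptions chosen only to show why a truncated enhancement layer still yields a coarse approximation.

```python
def to_bit_planes(values, num_planes):
    """Split non-negative integers into bit planes, most significant plane first."""
    return [[(v >> p) & 1 for v in values] for p in range(num_planes - 1, -1, -1)]


def from_bit_planes(planes, num_planes):
    """Rebuild values from the planes received so far; missing planes count as zeros."""
    values = [0] * len(planes[0]) if planes else []
    for i, plane in enumerate(planes):
        shift = num_planes - 1 - i  # plane 0 carries the most significant bits
        values = [v | (bit << shift) for v, bit in zip(values, plane)]
    return values


# Hypothetical enhancement-layer magnitudes (assumed data, 4 bits each).
residuals = [13, 6, 1, 0]
planes = to_bit_planes(residuals, 4)

# Only the two most significant planes arrive before the bandwidth drops.
partial = from_bit_planes(planes[:2], 4)
full = from_bit_planes(planes, 4)
```

With only two of the four planes received, the large magnitudes are already approximated (13 becomes 12, 6 becomes 4), while the small ones are still zero.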
MPEG-21 SVC (scalable video coding) is also under standardization as an FGS coding method. MPEG-21 DIA (Digital Item Adaptation) is also under standardization as an FGS coding method, based on the premise that a proxy node located in the middle of a network can arbitrarily change a bit rate.
Current FGS coding methods as described above provide a remarkably low peak signal-to-noise ratio (PSNR) compared to existing single-layer encoding methods that provide the same bit rate. Also, the current FGS coding methods require existing coding methods to be changed to achieve FGS.
Hereinafter, some conventional data-partitioning methods will be described.
First, ISO/IEC 13818-2 (7.10) deals with data partitioning, in which block data is divided into two parts based on priority breakpoints, the number of which is 67. When block data is divided into two parts, as specified in MPEG-2, two-layer encoding is possible, but FGS is not possible.
Second, ISO/IEC 14496-2:2003(E) (E.1.2) deals with data partitioning, which is used as an error-resilience tool. In data partitioning based on the MPEG-4 motion marker, motion vectors and headers of all of the macroblocks existing in video data are arranged in a front part of a video packet, and a motion marker for resynchronization is then arranged and followed by texture information (i.e., DCT coefficients) about all of the macroblocks. This data partitioning is conceptually narrower than the MPEG-2 Part 2 data partitioning and also cannot achieve FGS.
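The motion-marker packet layout described above can be sketched as follows. This is an illustration only: the dictionary keys (`header`, `motion`, `texture`) and the `MOTION_MARKER` token are hypothetical placeholders, not syntax elements of the standard.

```python
MOTION_MARKER = "MOTION_MARKER"  # hypothetical stand-in for the resynchronization marker


def build_partitioned_packet(macroblocks):
    """Place header/motion data of all macroblocks first, then the marker, then all texture data."""
    motion_part = []
    texture_part = []
    for mb in macroblocks:
        motion_part.append((mb["header"], mb["motion"]))
        texture_part.append(mb["texture"])
    return motion_part + [MOTION_MARKER] + texture_part


# Assumed sample data: two macroblocks with a motion vector and a few coefficients each.
macroblocks = [
    {"header": "MBH1", "motion": (1, -2), "texture": [12, 0, 3]},
    {"header": "MBH2", "motion": (0, 4), "texture": [7, 1]},
]
packet = build_partitioned_packet(macroblocks)
```

The packet is split into exactly two parts around the marker, which is why this layout helps error resilience but offers only one cut point rather than fine-grained scalability.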
Third, ITU-T Rec. H.264|ISO/IEC 14496-10 AVC (7.3.2.9 and 7.4.2.9) discusses the syntax and semantics of slice data partitioning RBSP. A joint video technique (JVT), in which data is divided into three categories, similarly cannot achieve FGS.
Thus, these data-partitioning methods cannot be used for the purpose of achieving FGS. Further, current standardized FGS coding methods need a large amount of overhead to achieve FGS, thereby providing a PSNR significantly lower than that of an existing single-layer encoding method which provides the same bit rate as the current FGS coding methods.
FIG. 1 illustrates a conventional single-layer encoding/decoding sequence, in which items included in a video packet are encoded. Referring to FIG. 1, the video packet includes first through N-th macroblocks. Each of the macroblocks includes a macroblock header (MBH) and six blocks. Each of the blocks is made up of items and an end-of-block (EOB) item which is disposed at the end of the items.
According to the conventional single-layer encoding/decoding method of FIG. 1, a first block of the first macroblock is encoded, followed by a second block, etc. After encoding of the first macroblock is completed in the above manner, the second macroblock is encoded. In other words, after all of the items of the first block of the first macroblock, that is, all first items and an item EOB, are encoded, the items of the second block are encoded, and so on until the first macroblock is finished and encoding moves on to the second macroblock. If encoding is performed in the above manner, the first and N-th macroblocks are disposed at the head and rear of the video packet, respectively, and corresponding items for each block are evenly distributed within the video packet.
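The ordering described above can be shown with a minimal Python sketch. The symbol strings and the function name are hypothetical, and the example uses two blocks per macroblock for brevity, whereas FIG. 1 uses six.

```python
def encode_conventional(macroblocks):
    """Emit symbols macroblock by macroblock and block by block, closing each block with EOB."""
    packet = []
    for mb_no, blocks in enumerate(macroblocks, start=1):
        packet.append(f"MBH{mb_no}")  # macroblock header comes first
        for items in blocks:
            packet.extend(items)      # all items of this block, in order
            packet.append("EOB")      # end-of-block item
    return packet


# Assumed sample data: two macroblocks, two blocks each.
macroblocks = [
    [["a1", "a2"], ["b1"]],
    [["c1"], ["d1", "d2"]],
]
packet = encode_conventional(macroblocks)
```

Here `packet` is `MBH1 a1 a2 EOB b1 EOB MBH2 c1 EOB d1 d2 EOB`, so the first items of later blocks (`c1`, `d1`) sit deep inside the packet rather than near its head.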
However, items in a front part of each block include more significant information, and items in a rear part of each block include less significant information. Also, an item EOB at the end of each block stores EOB, which simply indicates the end of a block.
Nevertheless, since block items are somewhat evenly distributed over a video packet in the conventional encoding method of FIG. 1, if a rear part of the video packet is cut off, macroblock information stored in the cut-off part of the video packet is completely lost. Thus, decoding of only the received portion of the video packet may impede proper reproduction of an original signal.
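The effect of cutting off the rear of such a packet can be illustrated with a short Python sketch. It makes a simplifying assumption: a macroblock counts as recoverable only if its header and all of its blocks, each terminated by EOB, arrived. The symbol strings and the function name are hypothetical.

```python
def count_recoverable_macroblocks(packet, blocks_per_mb):
    """Count macroblocks whose header and all blocks (each ending in EOB) are present."""
    complete = 0
    eob_seen = 0
    in_macroblock = False
    for symbol in packet:
        if symbol.startswith("MBH"):
            in_macroblock = True
            eob_seen = 0
        elif symbol == "EOB" and in_macroblock:
            eob_seen += 1
            if eob_seen == blocks_per_mb:
                complete += 1
                in_macroblock = False
    return complete


# Assumed packet in the conventional order: two macroblocks, two blocks each.
packet = ["MBH1", "a1", "a2", "EOB", "b1", "EOB",
          "MBH2", "c1", "EOB", "d1", "d2", "EOB"]

truncated = packet[:8]  # the rear third of the packet is cut off
recovered_full = count_recoverable_macroblocks(packet, 2)
recovered_cut = count_recoverable_macroblocks(truncated, 2)
```

Losing the last third of the packet removes the second macroblock entirely, including its most significant leading items, while the less significant trailing items of the first macroblock survive.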