This invention relates to coding of video signals, and more particularly to improving the coding performance of low bit-rate video coders which process the pels in each video frame on a block-by-block basis.
The CCITT (International Telegraph and Telephone Consultative Committee) has spent the past few years drafting its Recommendation H.261, which defines a method for video signal coding at p×64 kb/s, where p=1, 2, . . . , 30 (see, e.g., Working Party XV/1, CCITT, "Draft revised Recommendation H.261--video codec for audiovisual services at p×64 kbits/s," DOC. COM XV-R 17-E, Jan. 1990; and "Video Codec for Audiovisual Services at p×64 kbits/s," CCITT Recommendation H.261, COM XV-R 37-E, International Telegraph and Telephone Consultative Committee (CCITT), August 1990). H.261 is now a standard, having been formally approved in December 1990. At the low-rate end (p=1 or 2), a major application envisioned is videophone service over the Integrated Services Digital Network (ISDN). The H.261 video coder uses a hybrid coding approach (H. G. Musmann, P. Pirsch, and H-J. Grallert, "Advances in picture coding," Proc. IEEE, vol. 73, no. 4, pp. 523-548, Apr. 1985) in which interframe redundancy is exploited by motion-compensated differential pulse code modulation (DPCM) and the resulting signal is coded in the discrete cosine transform (DCT) domain.
Two spatial resolutions have been adopted for H.261: CIF (common intermediate format), having 288 lines × 352 pels per line, and QCIF (quarter CIF), having 144 lines × 176 pels per line. These resolutions apply to the luminance component of a color image. In each case, the resolution of each of the two chrominance components is one quarter of that of the luminance, being half in both the horizontal and vertical directions. Each video frame is divided into "macro blocks" for coding, where each macro block contains 16×16 pels or, more precisely, 16×16 luminance pels and two 8×8 arrays of chrominance pels. For each macro block, motion estimation is first performed and then the predicted result is divided into six 8×8 blocks (four for the luminance component and one each for the two chrominance components) for DCT and subsequent quantization and coding. For data compression efficiency, variable-wordlength codes are used extensively in the video multiplex coder. This necessitates the use of a buffer to hold the coded quantities before transmitting them through a fixed-rate channel. For framing and other purposes, macro blocks in a video frame are grouped into groups of blocks (GOBs), with each GOB consisting of 33 macro blocks (3 rows of 11 macro blocks). The GOBs are transmitted in sequence and, within each GOB, the coded macro blocks are transmitted one by one in natural order (i.e., row by row and, within each row, block by block from left to right).
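The frame partitioning described above can be checked with a few lines of arithmetic. The following is a minimal sketch in Python; the function name `partition` and the dictionary keys are illustrative assumptions and are not taken from H.261.

```python
MB_SIZE = 16        # luminance pels per macro block side
MBS_PER_GOB = 33    # 3 rows of 11 macro blocks
BLOCKS_PER_MB = 6   # four luminance and two chrominance 8x8 blocks

# (lines, pels per line) of the luminance component
FORMATS = {"CIF": (288, 352), "QCIF": (144, 176)}


def partition(lines, pels):
    """Count the coding units of one frame (illustrative helper)."""
    mb_rows = lines // MB_SIZE
    mb_cols = pels // MB_SIZE
    macro_blocks = mb_rows * mb_cols
    return {
        "mb_rows": mb_rows,
        "mb_cols": mb_cols,
        "macro_blocks": macro_blocks,
        "gobs": macro_blocks // MBS_PER_GOB,
        "blocks_8x8": macro_blocks * BLOCKS_PER_MB,
        # each chrominance component is halved in both directions
        "chroma_size": (lines // 2, pels // 2),
    }


if __name__ == "__main__":
    for name, (lines, pels) in FORMATS.items():
        print(name, partition(lines, pels))
```

Under these figures a CIF frame holds 18 × 22 = 396 macro blocks in 12 GOBs, and a QCIF frame holds 9 × 11 = 99 macro blocks in 3 GOBs.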
During the development of H.261, a series of reference coding algorithms called "reference models" was established. A recent model is called Reference Model 8, or RM8, and is described in "Description of reference model 8 (RM8)," Document 525, CCITT Study Group XV, Working Party XV/4, Specialists Group on Coding for Visual Telephony, Jun. 9, 1989. In the RM8 coding algorithm a two-dimensional variable length code (2-D VLC) is used, in which the run length of zero coefficients preceding each non-zero quantized coefficient is coded together with the magnitude of that non-zero coefficient. Quantization of the coefficients is by means of a quasi-uniform quantizer with an adjustable step-size controlled by the buffer level. A variable threshold is applied to the coefficients to increase the number of zero coefficients. Although the RM8 coding algorithm performs reasonably well over a wide range of bit rates, image degradation is visible at low bit rates such as 64 or 128 kb/s. This degradation arises because the low bit rates force the quantization to be coarse.
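The relationship between thresholding, quantization, and the run/level pairs fed to a 2-D VLC can be illustrated as follows. This is a simplified sketch and not the RM8 procedure: the functions `quantize` and `run_level_events` are hypothetical, the threshold is held fixed rather than varied, the quantizer simply truncates toward zero, and the coefficients are assumed to be already arranged in scan order. The code table and the signalling of the end of a block are outside the sketch.

```python
def quantize(coeffs, step, threshold):
    """Zero every coefficient below the threshold, then divide the rest
    by the step size, truncating toward zero (simplified assumption)."""
    levels = []
    for c in coeffs:
        if abs(c) < threshold:
            levels.append(0)
        else:
            levels.append(int(c / step))
    return levels


def run_level_events(levels):
    """Pair each non-zero level with the count of zeros preceding it.
    Trailing zeros produce no event."""
    events = []
    run = 0
    for v in levels:
        if v == 0:
            run += 1
        else:
            events.append((run, v))
            run = 0
    return events


if __name__ == "__main__":
    # Hypothetical transform coefficients of one block, in scan order.
    coeffs = [120, -35, 0, 14, 0, 0, -9, 3, 0, 0]
    for step in (8, 32):
        levels = quantize(coeffs, step=step, threshold=step)
        print(step, levels, run_level_events(levels))
```

With the hypothetical coefficients shown, raising the step-size from 8 to 32 reduces the number of events from four to two, which lowers the bit count at the cost of a coarser reconstruction.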
A typical videophone scene consists of a portrait of the conversing party with some foreground and background. As the person in the scene moves, parts of the background are covered and uncovered. The moving portion of the scene often spans a region extending across several macro blocks both horizontally and vertically. The aforenoted RM8 reference coding algorithm specifies that the quantizer step-size be adapted according to the buffer level once per row of macro blocks in a GOB. Two effects of this adaptation process have been observed. Firstly, in the still segments of a scene the buffer level is likely to drop to zero, so that the available channel capacity is not fully utilized. Secondly, and more importantly, in a contiguous set of block rows which contain moving objects, there may exist one or more rows for which the buffer level at the end of the row is significantly higher than at the beginning. Thus the respective next rows are quantized more coarsely than their preceding spatial neighbors. As a result, different parts of the same object that spans several block rows may be coded with significantly different quantization levels, causing noticeable image degradation. Although adapting the quantizer step-size once per macro block rather than once per block row will somewhat ease the problem, significant degradation is still likely to be observed on the right side of the image in "busy" moving areas.
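Both effects can be reproduced with a toy simulation of row-by-row step-size adaptation. Everything in this sketch is an assumption made for illustration: the linear mapping from buffer level to step-size in `step_from_buffer`, the range of even step-sizes from 2 to 62, the model that a row produces `activity // step` bits, and the numeric values in the example. None of it is the RM8 rate control.

```python
def step_from_buffer(level, buffer_size, min_step=2, max_step=62):
    """Hypothetical mapping: a fuller buffer gives a coarser (even) step."""
    span = max_step - min_step
    step = min_step + 2 * ((level * span) // (2 * buffer_size))
    return min(max(step, min_step), max_step)


def simulate_rows(activity, drain_per_row, buffer_size, level=0):
    """Adapt the step-size once per macro block row.

    activity[i] is a made-up measure of how many bits row i would cost
    at unit step-size; the row is assumed to cost activity[i] // step bits.
    """
    steps, levels = [], []
    for a in activity:
        step = step_from_buffer(level, buffer_size)  # chosen at row start
        bits = a // step
        # the channel drains a fixed number of bits per row
        level = max(0, min(buffer_size, level + bits - drain_per_row))
        steps.append(step)
        levels.append(level)
    return steps, levels


if __name__ == "__main__":
    # Two still rows, four rows covering a moving object, two still rows.
    activity = [0, 0, 40000, 40000, 40000, 40000, 0, 0]
    steps, levels = simulate_rows(activity, drain_per_row=2000,
                                  buffer_size=64000)
    for row, (s, lv) in enumerate(zip(steps, levels)):
        print(f"row {row}: step={s:2d} buffer_after={lv}")
```

In this made-up example the buffer stays empty during the still rows, and the first row of the moving object is coded with the finest step-size while the row immediately below it is coded much more coarsely, even though both rows belong to the same object.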
An object of the present invention is to improve the coding performance of a low bit-rate coder that performs block processing of the pels of a video signal.