This invention relates to coding techniques for the transmission of video information, and more particularly to coding methods for low bit-rate transmission of video information.
As Integrated Services Digital Network (ISDN) service is implemented, it is predicted that a new service demand will be created for video telephony. As each subscriber will have access to 2B channel capacity, with each B channel being 64 kb/s, transmission of motion video coupled with its associated audio over one or two of the B channels is desirable. In order to achieve such a low bit-rate video transmission with acceptable video quality, image compression techniques must be employed. In recent years, due to the dramatic improvements in integrated circuit process technology, the field of digital image and video processing has been experiencing tremendous growth. Advances in high capacity memory chips and VLSI technology have opened the way to low cost implementations of complex video compression algorithms, making the transmission of video signals with an acceptable quality at very low rates feasible.
Various image compression techniques are well known in the art, such as DPCM and transform coding, as described by H. G. Musmann, P. Pirsch and H. J. Grallert, in "Advances in picture coding," Proc. IEEE, vol. 73, pp. 523-548, April 1985. For low bit-rate motion video signals the combination of the two, known as hybrid coding, is considered to be the most efficient compression method. This hybrid coding scheme is described by A. Habibi, in "An adaptive strategy for hybrid image coding," IEEE Trans. Commun., vol. COM-29, pp. 1736-1740, December 1981; by W. Chen and W. K. Pratt, in "Scene adaptive coder," IEEE Trans. Commun., vol. COM-32, pp. 225-232, March 1984; and by S. Okubo, R. Nicol, B. Haskell and S. Sabri, in "Progress of CCITT standardization on n × 384 kbits/s video codec," Globecom-87.
In an interframe hybrid coder, each block of pel data, each element of which digitally represents the magnitude of a picture element, is compared element by element with the corresponding elements in a reconstructed coded block from the previous frame. The resultant block of difference data is transformed using a two-dimensional transform algorithm such as a two-dimensional discrete cosine transformation, and the coefficients in each block of data are quantized and entropy coded for transmission over the data channel. At the transmitter, each block, and thereby the entire frame, is reconstructed by inversely transforming the quantized coefficients and adding the result to the corresponding reconstructed pel elements of the previous frame. A frame memory stores the reconstructed pel elements for the next block-by-block differential comparison with the pel elements in the next video frame. Similarly, at the receiver the entropy encoded data stream is decoded and an inverse transformer reconstructs the quantized differential pel elements of each block, which are added to the pel elements of the previous frame to form the pel elements of the present frame.
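The interframe loop described above can be illustrated with a minimal sketch. It is offered only as an illustration under stated assumptions, not as the coder of any cited reference: the 4×4 block size, the uniform quantizer with a single step size, and the names `encode_block` and `reconstruct_block` are all hypothetical, and entropy coding is omitted.

```python
import math

N = 4  # block size; an assumption chosen to keep the example small


def dct_matrix(n):
    """Orthonormal DCT-II basis matrix C, so that coeffs = C * block * C^T."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(m):
    return [list(row) for row in zip(*m)]


C = dct_matrix(N)
CT = transpose(C)


def encode_block(current, prev_recon, step):
    """Difference against the previous reconstructed block, transform, quantize."""
    diff = [[current[r][c] - prev_recon[r][c] for c in range(N)] for r in range(N)]
    coeffs = matmul(matmul(C, diff), CT)          # two-dimensional DCT
    return [[round(v / step) for v in row] for row in coeffs]


def reconstruct_block(levels, prev_recon, step):
    """Dequantize, inverse transform, and add to the previous reconstructed block.

    The transmitter and the receiver both run this same step, so their
    frame memories hold identical pel values.
    """
    coeffs = [[level * step for level in row] for row in levels]
    diff = matmul(matmul(CT, coeffs), C)          # inverse two-dimensional DCT
    return [[min(255, max(0, round(prev_recon[r][c] + diff[r][c])))
             for c in range(N)] for r in range(N)]
```

In this sketch the quantized levels returned by `encode_block` stand in for the data that would be entropy coded and sent over the channel; a block identical to its predecessor yields all-zero levels.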
The coding efficiency of an interframe coder can be further improved by using motion compensation prediction methods such as described by T. Koga, K. Iinuma, A. Hirano, Y. Iijima, and T. Ishiguro in "Motion-compensated interframe coding for video conferencing," in Proc. NTC 81, pp. G5.3.1-G5.3.5. When using such methods, each block of data is characterized at the input of the encoder as a static block or a dynamic block, determined as a function of the magnitude of the difference data between the present block and the corresponding block from the previous frame. If the difference data is greater than a threshold, the block is characterized as dynamic and the previous frame is scanned to locate the block that most closely matches the present block. Difference data is then formed between the present block and the "matching" block in the previous frame. The transformed coefficients of each block are then transmitted by the encoder together with overhead information that includes a motion vector that indicates the shift in position of the block between frames.
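The static/dynamic classification and the search for a matching block can be sketched as follows. This is only an illustration under assumptions not stated in the text: the magnitude of the difference data is measured here as a sum of absolute differences, the search is an exhaustive scan of a small window, and the names `sad` and `classify_and_match` are hypothetical.

```python
def sad(frame_a, ax, ay, frame_b, bx, by, n):
    """Sum of absolute differences between two n x n blocks.

    (ax, ay) and (bx, by) are the top-left corners in each frame.
    """
    return sum(abs(frame_a[ay + r][ax + c] - frame_b[by + r][bx + c])
               for r in range(n) for c in range(n))


def classify_and_match(cur, prev, x, y, n, threshold, search):
    """Classify the block at (x, y) and, if dynamic, find its motion vector.

    Returns ("static", (0, 0)) when the difference against the co-located
    block of the previous frame does not exceed the threshold; otherwise
    returns ("dynamic", (dx, dy)) for the best match within +/- search pels.
    """
    height, width = len(prev), len(prev[0])
    zero_cost = sad(cur, x, y, prev, x, y, n)
    if zero_cost <= threshold:
        return "static", (0, 0)

    best_cost, best_mv = zero_cost, (0, 0)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            px, py = x + dx, y + dy
            # Skip candidates that fall outside the previous frame.
            if px < 0 or py < 0 or px + n > width or py + n > height:
                continue
            cost = sad(cur, x, y, prev, px, py, n)
            if cost < best_cost:
                best_cost, best_mv = cost, (dx, dy)
    return "dynamic", best_mv
```

The returned vector corresponds to the motion vector sent as overhead information; the difference data would then be formed against the block at `(x + dx, y + dy)` in the previous frame.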
The coding efficiency of a hybrid encoder depends on the coding of the transform coefficients, the effectiveness of the motion compensated prediction, and the size of the transform block. Advantageously, a large block size achieves better compression because fewer blocks are required to be transmitted per frame, and less overhead information therefore need be transmitted per video frame of data. Disadvantageously, however, as the block size increases, the complexity of the circuitry required to perform the transformation of each block dramatically increases. Furthermore, as the block size increases, there is increased subjective degradation in the decoded video signal, noted by the presence of block distortion in which the viewer perceives the outlines of the blocks.
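The effect of block size on per-frame overhead can be made concrete with a short calculation. The frame dimensions of 352×288 pels and the figure of 10 overhead bits per block are assumptions chosen purely for illustration, and the function names are hypothetical.

```python
import math


def blocks_per_frame(width, height, n):
    """Number of n x n blocks needed to cover a width x height frame."""
    return math.ceil(width / n) * math.ceil(height / n)


def overhead_bits_per_frame(width, height, n, bits_per_block):
    """Total overhead if every block carries bits_per_block of side information."""
    return blocks_per_frame(width, height, n) * bits_per_block


if __name__ == "__main__":
    # Assumed frame size and per-block overhead, for illustration only.
    for n in (8, 16, 32):
        print(n, blocks_per_frame(352, 288, n),
              overhead_bits_per_frame(352, 288, n, 10))
```

Under these assumptions, doubling the block dimension cuts the block count, and with it the overhead, to one quarter.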
An object of the present invention is to combine the advantages of both large and small block transformation in a hybrid type coder.
An additional object of the present invention is to entropy encode the coefficients of the transformed block data in as efficient a manner as possible and thereby transmit the video signal at a low bit rate.