The present invention relates to devices for, and processes of, coding video signals using adaptive quantization which, in preferred embodiments, can be readily adapted for the H.263+ and MPEG-4 video standards.
Coding of digital video signals for transmission over a transmission channel, or for storage in a storage medium, has become widely practiced in a variety of contexts, especially in multimedia environments. Such contexts include video conferencing, video games, Internet image transmissions, digital TV and the like. Devices used in these contexts typically include program-controlled coding processors (e.g., personal computers, hardware boards, multimedia boards and other suitable processing devices) for carrying out processes of coding (or decoding) video signals.
Video signals generally include coded data corresponding to one or more video frames, wherein each video frame is composed of an array of picture elements (pels). A typical color video frame can be of a variety of resolutions. In one common resolution known as quarter common intermediate format (QCIF), such as represented in FIG. 1, a video frame is composed of over twenty-five thousand pels, arranged in an array of 144 pels × 176 pels, where each pel has color (or hue) and luminance characteristics. Digital signals representing such video frame data can involve a relatively large number of bits. However, in many contexts, the available bandwidth for transmitting such signals is limited, or the available storage space (memory) is limited. Therefore, compression coding processes are used to more efficiently transmit or store video data.
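The size of an uncompressed QCIF frame can be illustrated with a short calculation. The following Python sketch is for illustration only; the figure of 24 bits per pel (8 bits for each of three components, with no chroma subsampling) is an assumption made here and is not stated in the text above.

```python
# Illustrative arithmetic for the raw size of one QCIF frame.
QCIF_ROWS = 144      # pels per column, from the text
QCIF_COLS = 176      # pels per row, from the text
BITS_PER_PEL = 24    # ASSUMPTION: 3 components x 8 bits, no subsampling


def pels_per_frame(rows=QCIF_ROWS, cols=QCIF_COLS):
    """Number of picture elements in one frame."""
    return rows * cols


def raw_bits_per_frame(rows=QCIF_ROWS, cols=QCIF_COLS, bits_per_pel=BITS_PER_PEL):
    """Uncompressed size of one frame in bits, under the assumed pel depth."""
    return pels_per_frame(rows, cols) * bits_per_pel


if __name__ == "__main__":
    print("pels per frame:", pels_per_frame())
    print("raw bits per frame (assumed depth):", raw_bits_per_frame())
```

The pel count of 25,344 agrees with the "over twenty-five thousand" figure given above; the bit count depends entirely on the assumed pel depth.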
Standards for compression coding processes have been developed by various organizations. For example, the International Telecommunication Union (ITU) has developed H series standards, which are typically used for real-time communications, such as videophones. Also, the International Organization for Standardization (ISO) has developed the Moving Picture Experts Group (MPEG) series standards, such as MPEG-1 and MPEG-2.
Compression coding processes in accordance with such standards typically involve a quantization step, in which sampled video signal data values are represented by (or mapped onto) a fixed number of predefined quantizer values, typically with some loss in accuracy. For example, consider the video input signal represented as a sine wave ranging between 0 and 255. As the video data signal is sampled, each data sample will have a data value of one of the 256 values possible within the above-noted range. If the number of bits allowed by the system (or the transmission channel bandwidth) for coding these values is limited to, for example, five, then the 256 video signal data values must be mapped to (and represented by) 2^5 = 32 quantizer values. In this manner, the quantized signal will be composed of quantizer values that are, in fact, estimates (with some loss in accuracy) of the sampled video signal values. For example, depending upon the quantization scheme, the quantized value selected from the set of 32 values to represent (map) a particular video signal data value is typically the quantizer value within the set that is the closest to the video signal data value or, alternatively, the quantizer value that is the closest value higher than the video signal data value or the closest value lower than the video signal data value. In any case, unless the video signal data value just happens to be equal to a quantizer value, the quantized value of the video signal data value will be an estimate of (but not equal to) the actual video signal data value.
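The mapping described above can be sketched in Python as follows. This is a minimal illustration, not a quantizer taken from any standard: the set of 32 values is assumed here to be evenly spaced (4, 12, ..., 252), and the function names are hypothetical. The three functions correspond to the three mapping rules mentioned above (closest value, closest lower value, closest higher value).

```python
import bisect


def make_quantizer(num_bits=5, value_range=256):
    """Build 2**num_bits evenly spaced quantizer values.

    ASSUMPTION: values sit at the midpoints of equal-width bins; the text
    does not specify how the 32 values are placed.
    """
    levels = 2 ** num_bits
    width = value_range // levels
    return [i * width + width // 2 for i in range(levels)]


def quantize_nearest(value, quantizer):
    """Map to the closest quantizer value (ties go to the lower value)."""
    return min(quantizer, key=lambda q: abs(q - value))


def quantize_lower(value, quantizer):
    """Map to the closest quantizer value not above the input.

    Falls back to the smallest quantizer value when none is lower.
    """
    idx = bisect.bisect_right(quantizer, value) - 1
    return quantizer[max(idx, 0)]


def quantize_higher(value, quantizer):
    """Map to the closest quantizer value not below the input.

    Falls back to the largest quantizer value when none is higher.
    """
    idx = bisect.bisect_left(quantizer, value)
    return quantizer[min(idx, len(quantizer) - 1)]


if __name__ == "__main__":
    q = make_quantizer()
    sample = 101
    print(len(q), quantize_nearest(sample, q),
          quantize_lower(sample, q), quantize_higher(sample, q))
```

Under these assumptions a sample of 100 happens to equal a quantizer value and is reproduced exactly, whereas a sample of 101 is represented by an estimate and incurs a small error.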
Ordinarily, when more bits are available for coding each video signal value, more quantizer values are available in the quantizer set and, therefore, higher accuracy in the compression coding (i.e., lower distortion) may be achieved. However, as noted above, communication systems and transmission channels function with limited bandwidths and data storage devices have limited storage capacities. Therefore, the number of bits available for coding each signal value (and, thus, the number of quantizer values available) is typically limited. Accordingly, typical modern systems must function with a quantization scheme that maps the video signal data onto a limited number of mapped (quantized) values and, thus, necessarily produces some loss in accuracy (distortion).
Some improvements in accuracy and distortion may be made by designing the quantizer set to include those values that have been found (by experimentation) to have the highest probability of being selected. However, the quantizer values that have the highest selection probability in one portion of a video frame are typically different from the quantizer values that have the highest selection probability in another portion of the frame, and, in a moving video context, such values differ from frame-to-frame as well. Typical coding systems employing current H.263 video standards and MPEG-4 verification models do not have the ability to change quantizers as much as one would like, to adapt to such statistical changes.
It is an object of the preferred embodiments of the present invention to provide an improved compression coding system and process which are capable of providing improved accuracy (less distortion) and/or improved bit rate characteristics. It is a further object of preferred embodiments of the invention to provide such a system and process which adapt to the statistical changes of the imagery in a video signal while maintaining the basic structure of the quantization processes of the H.263+ standard and the MPEG-4 video verification model.
These and other objects are achieved, according to preferred embodiments, with a system and process for coding video signals, employing an adaptive quantization scheme. In preferred embodiments, quantization is carried out by selecting a quantizer (a set of quantization values) from a plurality or group of predefined quantizers (sets of quantization values), for each frame or, more preferably, for each frame portion (e.g., each 16×16 block or 8×8 block of pels, also referred to as a macro block, in the video coding context). In this manner, the system or process is capable of changing to a different quantizer for different video frames or, preferably, for different portions of the same frame.
The quantizer selection is based on a determination of which quantizer provides the best distortion and bit rate characteristics among the quantizers available in the selection group for the portion of the video signal being coded. In preferred embodiments, the selection is based on a formula which takes into account both the distortion (accuracy) and bit rate. The quantizer that exhibits the best combined distortion and bit rate characteristics is selected for coding the frame or frame portion. In preferred embodiments, a similar formula, based on both distortion and bit rate characteristics, is used to select the particular quantization value within the quantizer set for each video signal value being coded. As a result, significantly improved overall coding efficiency and accuracy were observed.
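The two-level selection described above can be sketched in Python. The formula itself is not given in this section, so the sketch assumes a commonly used combined cost, J = D + lambda * R, where D is the squared error, R is the number of bits, and lambda is a weighting factor; this cost, the per-value code lengths, and all function names are assumptions made for illustration and may differ from the formula of the preferred embodiments.

```python
def choose_value(sample, values, code_lengths, lam):
    """Return the index of the quantization value with the lowest combined
    cost (squared error + lam * bits) for a single sample."""
    costs = [(sample - v) ** 2 + lam * bits
             for v, bits in zip(values, code_lengths)]
    return min(range(len(values)), key=costs.__getitem__)


def block_cost(block, values, code_lengths, lam):
    """Return (combined cost, distortion, rate) for coding one block of
    samples with one quantizer, choosing each value via choose_value."""
    distortion = 0
    rate = 0
    for sample in block:
        i = choose_value(sample, values, code_lengths, lam)
        distortion += (sample - values[i]) ** 2
        rate += code_lengths[i]
    return distortion + lam * rate, distortion, rate


def choose_quantizer(block, quantizers, lam):
    """Return the index of the quantizer with the lowest combined cost.

    quantizers is a list of (values, code_lengths) pairs; code_lengths holds
    the ASSUMED number of bits needed to code each value.
    """
    def cost(k):
        values, code_lengths = quantizers[k]
        return block_cost(block, values, code_lengths, lam)[0]
    return min(range(len(quantizers)), key=cost)


if __name__ == "__main__":
    coarse = ([0, 64, 128, 192], [2, 2, 2, 2])
    fine = (list(range(0, 256, 32)), [3] * 8)
    block = [30, 34, 95, 100]   # hypothetical sample values
    for lam in (0, 1000):
        print(lam, choose_quantizer(block, [coarse, fine], lam))
```

In this hypothetical example, a weighting factor of zero favors the finer quantizer, which has lower distortion, while a large weighting factor favors the coarser quantizer, which needs fewer bits. Varying the weighting factor therefore trades accuracy against bit rate for each frame portion.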