The invention relates to compression encoding of information streams and, more particularly, the invention relates to content-adaptive compression encoding of information streams to selectively provide information emphasis and/or de-emphasis.
In several communications systems, the data to be transmitted is compressed so that the available bandwidth is used more efficiently. For example, the Moving Picture Experts Group (MPEG) has promulgated several standards relating to digital data delivery systems. The first, known as MPEG-1, refers to ISO/IEC standards 11172 and is incorporated herein by reference. The second, known as MPEG-2, refers to ISO/IEC standards 13818 and is incorporated herein by reference. A compressed digital video system is described in the Advanced Television Systems Committee (ATSC) digital television standard document A/53, which is incorporated herein by reference.
The above-referenced standards describe data processing and manipulation techniques that are well suited to the compression and delivery of video, audio and other information using fixed or variable length digital communications systems. In particular, the above-referenced standards, and other “MPEG-like” standards and techniques, compress, illustratively, video information using intra-frame coding techniques (such as run-length coding, Huffman coding and the like) and inter-frame coding techniques (such as forward and backward predictive coding, motion compensation and the like). Specifically, in the case of video processing systems, MPEG and MPEG-like video processing systems are characterized by prediction-based compression encoding of video frames with or without intra- and/or inter-frame motion compensation encoding.
In an MPEG-like motion video compression process, it is known to use a quantity called Mquant to determine the quality of the encoding of each 16 picture element (pixel) by 16 pixel macroblock in a video frame or picture. That is, a rate-control process determines a value of Mquant for each macroblock in a picture to produce (at a decoder) the best quality picture without exceeding the available bit budget. Lower values of Mquant, which result in higher bit allocations and correspondingly better picture quality, are typically assigned to macroblocks within regions of low activity (i.e., low luminance frequency), since the human eye is more sensitive to such low frequency video information. Similarly, higher values of Mquant, which result in lower bit allocations and correspondingly lower picture quality, are typically assigned to macroblocks within regions of high activity (i.e., high luminance frequency), since the human eye is less sensitive to such high frequency video information.
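The activity-driven assignment described above can be illustrated with a minimal Python sketch. It is not taken from any of the referenced standards. The activity measure (luminance variance within the macroblock), the normalization formula, the base quantizer value and the function names `macroblock_activity` and `assign_mquant` are all assumptions chosen for illustration; the 1 to 31 clamp reflects the range of the MPEG-2 quantiser scale code.

```python
from statistics import pvariance

MB = 16  # macroblock size in pixels (16 x 16, as described above)


def macroblock_activity(luma, top, left):
    """Hypothetical activity measure: 1 + luminance variance of one macroblock."""
    pixels = [luma[y][x]
              for y in range(top, top + MB)
              for x in range(left, left + MB)]
    return 1.0 + pvariance(pixels)


def assign_mquant(luma, base_q=8):
    """Return one Mquant value per macroblock of a luminance frame.

    Low-activity macroblocks receive lower Mquant (more bits), and
    high-activity macroblocks receive higher Mquant (fewer bits).
    base_q stands in for the value a rate controller would supply.
    """
    rows, cols = len(luma) // MB, len(luma[0]) // MB
    act = [[macroblock_activity(luma, r * MB, c * MB) for c in range(cols)]
           for r in range(rows)]
    avg = sum(sum(row) for row in act) / (rows * cols)
    mquant = []
    for row in act:
        out = []
        for a in row:
            # Assumed normalization: maps activity into the range (0.5, 2.0).
            norm = (2 * a + avg) / (a + 2 * avg)
            out.append(max(1, min(31, round(base_q * norm))))
        mquant.append(out)
    return mquant


# Usage: one flat macroblock next to one checkerboard (high-activity) macroblock.
flat = [[128] * MB for _ in range(MB)]
busy = [[255 * ((x + y) % 2) for x in range(MB)] for y in range(MB)]
frame = [flat[y] + busy[y] for y in range(MB)]
result = assign_mquant(frame)
```

In this usage example the flat macroblock ends up with a lower Mquant than the checkerboard macroblock, which is the purely mechanical behavior that the following paragraph identifies as a potential problem.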
Unfortunately, the mechanical assignment of Mquant and other qualitative parameters may produce encoded pictures having regions of special interest encoded with less than desired quality. For example, an advertisement for a soap manufacturer may comprise an image of a person holding up a bar of soap. From the perspective of the soap manufacturer, the details of the bar of soap are very important, while the details of the person holding the soap are less important. Thus, the important image region is the image region including the bar of soap, rather than the image region including the person. However, if the above-described mechanical assignment of Mquant values is used, the qualitative emphasis may be misdirected to the person rather than the bar of soap.
Therefore, it is seen to be desirable to provide content-adaptive encoding of information, such as video information. Specifically, it is seen to be desirable to provide content-adaptive encoding of information effectuating a selective enhancement or degradation of information quality, such as video information quality.
The invention comprises a method and concomitant apparatus for providing selective enhancement and/or degradation of an information frame using content-based, regional analysis techniques. In general, the invention provides a subjective evaluator that delineates regions of an information space to be encoded in a qualitatively preferential or non-preferential manner such that the encoded information space comprises one or more of normal, emphasized or de-emphasized information content.
Specifically, the invention comprises a method for selectively encoding an information stream comprising a plurality of information frames, comprising the steps of: generating, in response to a subjective evaluation of the contents of an information frame, a mask indicative of a desired encoding quality adjustment for one or more information regions within said information frame; associating each of said one or more information regions with respective encoding quality adjustment indicia; and encoding said information frame in accordance with said encoding quality adjustment indicia.
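The three steps of the method (generating a mask, associating regions with adjustment indicia, and encoding accordingly) can be sketched as follows. This is only one possible reading. The mask granularity (one label per macroblock), the rectangular region format, the use of a multiplicative scale on Mquant as the "encoding quality adjustment indicia", and the names `ADJUSTMENT`, `rect_mask` and `adjust_mquant` are all hypothetical and are not specified in the text.

```python
# Hypothetical indicia: a multiplicative scale applied to each macroblock's Mquant.
# Values below 1.0 enhance quality (more bits); values above 1.0 degrade it.
ADJUSTMENT = {"emphasize": 0.5, "normal": 1.0, "de-emphasize": 2.0}


def rect_mask(rows, cols, regions):
    """Build a per-macroblock mask from operator-delineated rectangles.

    regions is a list of (label, (row0, col0, row1, col1)) entries with
    exclusive end indices; macroblocks outside every rectangle stay "normal".
    """
    mask = [["normal"] * cols for _ in range(rows)]
    for label, (r0, c0, r1, c1) in regions:
        for r in range(r0, r1):
            for c in range(c0, c1):
                mask[r][c] = label
    return mask


def adjust_mquant(base_mquant, mask):
    """Apply the indicia associated with each region to the base Mquant values."""
    return [[max(1, min(31, round(q * ADJUSTMENT[m])))
             for q, m in zip(q_row, m_row)]
            for q_row, m_row in zip(base_mquant, mask)]


# Usage: a 2 x 3 macroblock frame with one emphasized and one de-emphasized block.
base = [[8, 8, 8], [8, 8, 8]]
mask = rect_mask(2, 3, [("emphasize", (0, 0, 1, 1)),
                        ("de-emphasize", (1, 2, 2, 3))])
adjusted = adjust_mquant(base, mask)  # [[4, 8, 8], [8, 8, 16]]
```

The adjusted values would then be handed to the encoder in place of the mechanically assigned ones, so that the emphasized region is quantized more finely and the de-emphasized region more coarsely.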
In one embodiment of the invention, an operator delineates high value and/or low value regions of an information frame, illustratively a video frame. High value portions of the information frame are encoded at an enhanced quality level, while low value portions of the information frame are encoded at a degraded quality level.
In another embodiment of the invention, the quality enhancement and/or degradation is effected using one or more of several techniques, including a bit allocation method and a regional and sub-regional filtering method.