1. Field of the Invention
The invention is related to video compression systems, methods, and computer program products.
2. Description of the Related Art
Real-time transmission of moving pictures is employed in several applications, e.g. video conferencing, net meetings, TV broadcasting and video telephony.
However, representing moving pictures requires a large amount of information, as digital video is typically described by representing each pixel in a picture with 8 bits (1 byte). Such uncompressed video data results in large bit volumes and cannot be transferred over conventional communication networks and transmission lines in real time due to limited bandwidth.
Thus, enabling real-time video transmission requires a large degree of data compression. Data compression may, however, compromise picture quality. Therefore, great efforts have been made to develop compression techniques allowing real-time transmission of high-quality video over bandwidth-limited data connections.
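By way of illustration only, the bit volume of uncompressed video may be estimated as follows. The 8 bits per pixel are taken from the description above; the picture size (352×288 pixels) and the frame rate (30 pictures per second) are assumptions chosen for the example, and the function name is hypothetical.

```python
def raw_bitrate_bps(width, height, frames_per_second, bits_per_pixel=8):
    """Bits per second needed to send every pixel of every picture uncompressed."""
    return width * height * frames_per_second * bits_per_pixel


# Assumed example format: 352x288 pixels at 30 pictures per second.
bitrate = raw_bitrate_bps(352, 288, 30)
print(f"{bitrate} bit/s (about {bitrate / 1e6:.1f} Mbit/s)")
```

Under these assumptions the uncompressed stream requires roughly 24 Mbit/s, which indicates why compression is needed on bandwidth-limited connections.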
In video compression systems, the main goal is to represent the video information with as little capacity as possible. Capacity is measured in bits, either as a constant value or as bits per time unit. In both cases, the main goal is to reduce the number of bits.
The most common video coding method is described in the MPEG* and H.26* standards. The video data undergo four main processes before transmission, namely prediction, transformation, quantization and entropy coding.
The prediction process significantly reduces the number of bits required for each picture in a video sequence to be transferred. It takes advantage of the similarity of parts of the sequence with other parts of the sequence. Since the predictor part is known to both encoder and decoder, only the difference has to be transferred. This difference typically requires much less capacity for its representation. The prediction is mainly based on picture content from previously reconstructed pictures, where the location of the content is defined by motion vectors. The prediction process is typically performed on square blocks (e.g. 16×16 pixels). In some cases, however, pixels are predicted from adjacent pixels in the same picture rather than from pixels of preceding pictures. This is referred to as intra prediction, as opposed to inter prediction.
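By way of illustration only, inter prediction with a motion vector may be sketched as follows. The picture contents, the 2×2 block size and the function names `predict_block` and `residual` are hypothetical and are chosen only to show that a well-matched motion vector leaves a difference that is cheap to represent.

```python
def predict_block(ref, top, left, mv, size):
    """Copy a size x size block from the reference picture, displaced by mv = (dy, dx)."""
    dy, dx = mv
    return [[ref[top + dy + r][left + dx + c] for c in range(size)]
            for r in range(size)]


def residual(cur_block, pred_block):
    """Pixel-wise difference between the current block and its prediction."""
    return [[c - p for c, p in zip(cur_row, pred_row)]
            for cur_row, pred_row in zip(cur_block, pred_block)]


# Reference picture: a simple horizontal gradient (4 rows, 6 columns).
ref = [[10 * c + r for c in range(6)] for r in range(4)]
# Current picture: the same content shifted one pixel to the right.
cur = [[row[0]] + row[:-1] for row in ref]

top, left, size = 0, 2, 2
cur_block = [row[left:left + size] for row in cur[top:top + size]]

res_zero_mv = residual(cur_block, predict_block(ref, top, left, (0, 0), size))
res_true_mv = residual(cur_block, predict_block(ref, top, left, (0, -1), size))
print(res_zero_mv)  # non-zero difference without motion compensation
print(res_true_mv)  # all-zero difference with the matching motion vector
```

In this sketch only the motion vector and the (here all-zero) difference would need to be transferred, since the decoder holds the same reference picture.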
The residual, represented as a block of data (e.g. 4×4 pixels), still contains internal correlation. A well-known method of taking advantage of this is to perform a two-dimensional block transform. The ITU recommendation H.264 uses a 4×4 integer-type transform. This transforms 4×4 pixels into 4×4 transform coefficients, which can usually be represented by fewer bits than the pixel representation. Transforming a 4×4 array of pixels with internal correlation will probably result in a 4×4 block of transform coefficients with far fewer non-zero values than the original 4×4 pixel block.
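By way of illustration only, a 4×4 integer transform of the kind used in H.264 may be sketched as follows. The matrix `CF` is the forward core transform matrix of that recommendation; the scaling that the standard folds into the quantization step is omitted here, and the input block and helper names are hypothetical.

```python
# Forward core transform matrix of the H.264 4x4 integer transform.
CF = [
    [1, 1, 1, 1],
    [2, 1, -1, -2],
    [1, -1, -1, 1],
    [1, -2, 2, -1],
]


def matmul(a, b):
    """Product of two 4x4 integer matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def transpose(m):
    return [list(col) for col in zip(*m)]


def forward_transform(block):
    """Compute CF * block * CF^T (unscaled coefficients)."""
    return matmul(matmul(CF, block), transpose(CF))


# Hypothetical correlated block: the same smooth horizontal ramp in every row.
ramp = [[10, 11, 12, 13] for _ in range(4)]
coeffs = forward_transform(ramp)
nonzero = sum(1 for row in coeffs for v in row if v != 0)
print(coeffs[0], nonzero)
```

For this correlated block, sixteen non-zero pixel values are concentrated into three non-zero coefficients, all in the first row.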
Direct representation of the transform coefficients is still too costly for many applications. A quantization process is therefore carried out to further reduce the data representation. The possible value range of the transform coefficients is divided into value intervals, each limited by an uppermost and a lowermost decision value and assigned a fixed quantization value. The transform coefficients are then quantized to the quantization value associated with the interval within which the respective coefficient resides. Coefficients lower than the lowest decision value are quantized to zero. It should be mentioned that this quantization process causes the reconstructed video sequence to differ somewhat from the uncompressed sequence.
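By way of illustration only, such a quantization may be sketched with a uniform quantizer. The step size of 10, the rounding rule and the coefficient values are assumptions made for the example; the standards referred to herein define their own step sizes and rounding.

```python
def quantize(coeff, step):
    """Map a coefficient to the index of the interval it falls in.

    With step = 10 the decision values lie at 5, 15, 25, ...; any
    coefficient with magnitude below 5 is quantized to zero.
    """
    level = (abs(coeff) + step // 2) // step
    return level if coeff >= 0 else -level


def dequantize(level, step):
    """Reconstruct the fixed quantization value assigned to the interval."""
    return level * step


step = 10  # hypothetical step size
coeffs = [184, -28, 3, -4]  # hypothetical transform coefficients
levels = [quantize(c, step) for c in coeffs]
reconstructed = [dequantize(q, step) for q in levels]
print(levels, reconstructed)
```

The reconstructed values differ from the original coefficients, which is the source of the difference between the reconstructed and the uncompressed sequence mentioned above.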
As already indicated, one characteristic of video content to be coded is that the number of bits required to describe the sequence varies strongly. For several applications, it is well known to a person skilled in the art that the content in a considerable part of the picture is unchanged from frame to frame. H.264 widens this definition so that parts of the picture with constant motion can also be coded without the use of additional information. Regions with little or no change from frame to frame require a minimum number of bits to be represented. The blocks included in such regions are defined as “skipped”, reflecting that no changes, or only predictable motion relative to the corresponding previous blocks, occur. Hence no data is required for representing these blocks other than an indication that the blocks are to be decoded as “skipped”. This indication may be common to several macroblocks.
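By way of illustration only, a decision to mark a block as “skipped” may be sketched as follows. The use of a sum of absolute differences against a fixed threshold is an assumption made for the example; the standards do not prescribe how an encoder makes this decision, and the function names and threshold value are hypothetical.

```python
def sad(block_a, block_b):
    """Sum of absolute pixel differences between two equally sized blocks."""
    return sum(abs(a - b)
               for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))


def is_skipped(cur_block, pred_block, threshold):
    """Treat the block as skipped when it barely differs from its prediction."""
    return sad(cur_block, pred_block) <= threshold


previous = [[120] * 4 for _ in range(4)]
unchanged = [[120] * 4 for _ in range(4)]
changed = [[60] * 4 for _ in range(4)]

print(is_skipped(unchanged, previous, threshold=32))  # static background
print(is_skipped(changed, previous, threshold=32))    # content has changed
```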
The present standards of video coding (H.263/H.264, both of which are incorporated herein by reference) are very efficient in reducing bit rate while still maintaining reasonable overall subjective image quality. The errors introduced are mostly acceptable from a subjective point of view, even if the objective error of the reconstructed image is similar to what would be obtained by using pixel representation and reducing the number of bits/pixel from 8 to 4.
There are numerous examples within the prior art that disclose techniques for pixel prediction or quantization processes intended to reduce subjective noise. U.S. Pat. No. 6,037,985 to Wong discloses a method for video compression that deals with defects caused by compression. That publication focuses on subjective quality within single windows, usually referred to as intra-picture. The tools used in that publication to improve subjective quality within single windows are:
    Estimating “noise immunity” based on sharp edges in proximity to smooth surfaces; this technique has long been known to a person skilled in the art.
    Using the result of the previous estimation to affect the Q-factor; the adjustment of the Q-factor is the only constructive feature referenced.
    Traversing the picture in several passes (steps) so as to adjust the Q-factor for each single macroblock (MB), resulting in a “target” bit count for the picture, and further taking into account the estimated “noise immunity” so as to improve the subjective quality of the picture.
However, U.S. Pat. No. 6,037,985 does not propose any solution to annoying subjective noise due to movement from one picture to the next. U.S. Patent Publication 2002/0168011 by Bourge discloses another familiar method of detecting noise in a flow of video data coded by macroblock according to a predictive block-based encoding technique. However, this method does not solve problems caused by movement from one picture to the next.
In the following, mechanisms causing severe, visibly annoying effects are explained; these effects are not addressed by the above-referenced publications.
It is well known that certain types of video material tend to cause visibly annoying artifacts. The problem is particularly related to edges or transitions between relatively “flat” areas in the image, i.e. areas of relatively uniform pixel values. The problem is generally related to errors introduced by the quantization of transform coefficients, but it is more evident in the case of a moving object revealing a smooth background. The problem is illustrated on the left-hand side of FIG. 1. As the dark part moves, it leaves behind some black “rubbish” in a background area that is supposed to be uniformly light.
These phenomena have been well known for as long as the present prediction/transform coding has been in use. The phenomenon has sometimes been called the “dirty window effect”. Similar phenomena have also been referred to as the “mosquito effect”.
Another similar effect is related to the block segmentation used in connection with coding as described earlier. Block coding is a powerful overall tool for compression and takes advantage of the correlation between neighboring pixels. However, the method is not well suited to treating singular pixel values. This is reflected in the case where a block mainly includes “flat” content, except for one or a few pixels close to an edge or a corner of the block that differ significantly. This can typically happen if the block just touches a different object. In such situations the singular pixels can be left unchanged, resulting in annoying black or white spots near the block border in the decoded frame, e.g. when the different object starts moving away from the block. The problem is illustrated on the right-hand side of FIG. 1. Here, a black area, originally in a stationary position marginally covering one of the corners of a block, is about to move away from the block. The change due to the movement is so small that the block will still be indicated as “skipped”, and thus the block will remain unchanged. However, the changes appear to be greater in the adjacent block, which is therefore correctly updated (note that the above-described “mosquito effect” may appear anywhere within a block, independently of the presently discussed effect). The result is a remaining black corner that is clearly visible in the middle of a light area. This problem is referred to as the “corner problem”.
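By way of illustration only, the “corner problem” may be reproduced with two adjacent 4×4 blocks. The pixel values, the skip threshold, the difference measure and the assumption that a non-skipped block is updated without error are all hypothetical simplifications made for the example.

```python
LIGHT, DARK = 200, 20
SKIP_THRESHOLD = 256  # hypothetical per-block threshold


def sad(block_a, block_b):
    """Sum of absolute pixel differences between two blocks."""
    return sum(abs(a - b)
               for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))


def decode_block(prev_block, cur_block, threshold):
    """Return what the decoder displays for one block.

    A block whose change is below the threshold is "skipped" and the
    previous content is kept; otherwise it is replaced by the current
    content (an error-free update is assumed for simplicity).
    """
    if sad(prev_block, cur_block) <= threshold:
        return [row[:] for row in prev_block]
    return [row[:] for row in cur_block]


# Block A: light background, dark object touching only the bottom-right pixel.
prev_a = [[LIGHT] * 4 for _ in range(4)]
prev_a[3][3] = DARK
# Block B (adjacent): fully covered by the dark object.
prev_b = [[DARK] * 4 for _ in range(4)]

# The object moves away; both blocks should now be uniformly light.
cur_a = [[LIGHT] * 4 for _ in range(4)]
cur_b = [[LIGHT] * 4 for _ in range(4)]

shown_a = decode_block(prev_a, cur_a, SKIP_THRESHOLD)
shown_b = decode_block(prev_b, cur_b, SKIP_THRESHOLD)
print(shown_a[3][3], shown_b[0][0])
```

In this sketch block A changes too little to be updated, so its dark corner pixel remains visible next to the correctly updated block B.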