1. Field of the Invention
The invention relates to video compression and encoding methods, and more specifically to video compression methods that employ techniques to improve the efficiency, compactness, and transmission of digital image and video data.
2. Description of Related Art
Digital pictorial information, whether derived from an analogue source by digitization or captured directly by a digital device, consists of huge volumes of data. As devices capture images at ever higher resolution, the amount of data required to represent them grows accordingly. Stored in raw format, a single image may well require tens of megabytes of disk space.
The problem is further exacerbated for digital video data, especially high-definition video. A two-hour movie stored in raw form at the highest-resolution ATSC frame size (1920×1080 pixels at 30 frames per second) requires almost 641 Gbyte of disk space. At a data rate of almost 89 Mbyte/s, the bandwidth required for transmission far exceeds what is currently available.
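The arithmetic behind these figures can be reproduced with a short calculation. The sketch below assumes 12 bits per pixel (8-bit samples with 4:2:0 chroma subsampling) and counts 1 Mbyte as 2**20 bytes; neither assumption is stated in the text, but together they give a rate just under 89 Mbyte/s and a two-hour total of roughly 641 thousand Mbyte.

```python
# Raw storage and bandwidth for uncompressed 1080-line video.
# ASSUMPTION: 12 bits per pixel (8-bit 4:2:0 sampling); not stated in the text.
WIDTH, HEIGHT, FPS = 1920, 1080, 30
BITS_PER_PIXEL = 12
DURATION_S = 2 * 60 * 60  # two hours

bytes_per_frame = WIDTH * HEIGHT * BITS_PER_PIXEL // 8
bytes_per_second = bytes_per_frame * FPS
total_bytes = bytes_per_second * DURATION_S

MIB = 2 ** 20  # ASSUMPTION: "Mbyte" is taken to mean 2**20 bytes
rate_mib_s = bytes_per_second / MIB
total_mib = total_bytes / MIB

print(f"{bytes_per_frame} bytes per frame")
print(f"{rate_mib_s:.2f} Mbyte/s")
print(f"{total_mib / 1000:.1f} thousand Mbyte for two hours")
```

With decimal units throughout, the same total is about 672 Gbyte, so the exact figure depends on the unit convention chosen.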
The encoding operation may be considered to be a three-stage process. First, a block predictor, created from data already available to the decoder, is subtracted from the original data to form a prediction error signal. Second, the prediction error is block transformed and quantized. Finally, the transform coefficients are entropy coded to form a binary bitstream that constitutes the compressed frame.
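The three stages can be illustrated on a single four-sample block. This is a minimal sketch, not the method of any particular codec: the 4-point Hadamard transform, the uniform quantizer, and the signed Exp-Golomb code are all stand-ins chosen for brevity, and every function name is hypothetical.

```python
def hadamard4(x):
    """4-point Hadamard transform (a simple stand-in for a block transform)."""
    a, b, c, d = x
    return [a + b + c + d, a + b - c - d, a - b - c + d, a - b + c - d]

def quantize(coeff, step):
    """Uniform quantizer, rounding the magnitude to the nearest level."""
    magnitude = (abs(coeff) + step // 2) // step
    return magnitude if coeff >= 0 else -magnitude

def signed_exp_golomb(value):
    """Signed Exp-Golomb code word, returned as a string of '0'/'1'."""
    code_num = 2 * value - 1 if value > 0 else -2 * value
    binary = bin(code_num + 1)[2:]
    return "0" * (len(binary) - 1) + binary

def encode_block(source, predictor, step):
    # Stage 1: subtract the predictor to form the prediction error.
    residual = [s - p for s, p in zip(source, predictor)]
    # Stage 2: transform the error and quantize the coefficients.
    levels = [quantize(c, step) for c in hadamard4(residual)]
    # Stage 3: entropy code the quantized coefficients into a bitstream.
    bits = "".join(signed_exp_golomb(v) for v in levels)
    return levels, bits

levels, bits = encode_block([52, 55, 61, 66], [50, 50, 60, 60], step=4)
print(levels, bits)
```

The better the predictor, the smaller the residual, and the fewer bits the final stage needs.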
The prediction stage may involve spatial or temporal prediction for video. For image compression, with no available temporal data, the only prediction mode available is spatial.
Many of the more successful algorithms have a two-dimensional block transform method at their core, partitioning each frame into rectangular blocks (usually 8×8 or 4×4) and applying the transform to each. Compression is achieved by coding the transform coefficients more efficiently than the original spatial data can be coded.
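The reason coefficients can be cheaper to code than spatial samples is that, for smooth image content, a block transform concentrates the signal into a few coefficients. The sketch below applies an orthonormal two-dimensional DCT to a 4×4 block containing a horizontal ramp; the block contents and function names are illustrative assumptions.

```python
import math

def dct_1d(v):
    """Orthonormal DCT-II of a list of samples."""
    n = len(v)
    out = []
    for k in range(n):
        scale = math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
        total = sum(v[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                    for i in range(n))
        out.append(scale * total)
    return out

def dct_2d(block):
    """Separable 2-D DCT: transform each row, then each column."""
    rows = [dct_1d(r) for r in block]
    cols = [dct_1d(list(c)) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]

block = [[10, 20, 30, 40] for _ in range(4)]  # smooth horizontal ramp
coeffs = dct_2d(block)

# Count coefficients that are not (numerically) zero.
nonzero = sum(1 for row in coeffs for c in row if abs(c) > 1e-9)
print(f"{nonzero} of 16 coefficients are non-zero")
```

Sixteen non-zero samples collapse to a handful of coefficients, and the transform preserves total energy, so nothing is lost before quantization.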
The Discrete Cosine Transform (DCT) has received the most attention over the last thirty years or so, being the transform of choice in all of the MPEG video compression standards and in the original JPEG image compression standard.
Another aspect of the invention covers the ability to reuse previously transmitted motion vectors, which may not appear directly adjacent to the current block, and to use statistics on these previously transmitted motion vectors to reduce the cost of encoding new motion vectors.
Motion fields tend to track real objects that move from one frame to the next. These objects are usually larger than a single block, so there is usually reasonable consistency of motion vectors from one block to the next. Prior art makes use of this consistency by predicting a new motion vector from the motion vectors of the surrounding blocks and then encoding the difference between the real motion vector and the predicted motion vector. The prior art also uses only a small subset of blocks in the prediction, typically four surrounding motion vectors (left, above left, above, and above right).
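The prior-art scheme of predicting a motion vector from its neighbours and coding only the difference may be sketched as follows. The rule used to combine the four neighbours (a component-wise lower median) is an assumption made for illustration; the text does not say how the neighbours are combined, and the function names are hypothetical.

```python
from statistics import median_low

def predict_mv(neighbours):
    """Predict a motion vector from neighbouring vectors.

    ASSUMPTION: component-wise lower median; the combining rule is not
    specified in the text.
    """
    if not neighbours:
        return (0, 0)
    return (median_low([mv[0] for mv in neighbours]),
            median_low([mv[1] for mv in neighbours]))

def mv_difference(actual, neighbours):
    """Encoder side: the difference that is actually transmitted."""
    px, py = predict_mv(neighbours)
    return (actual[0] - px, actual[1] - py)

def mv_reconstruct(difference, neighbours):
    """Decoder side: add the difference back to the same prediction."""
    px, py = predict_mv(neighbours)
    return (difference[0] + px, difference[1] + py)

# Left, above-left, above, and above-right neighbours of the current block.
neighbours = [(4, -2), (5, -2), (4, -1), (6, -2)]
actual = (5, -2)

diff = mv_difference(actual, neighbours)
print(diff, mv_reconstruct(diff, neighbours))
```

When motion is consistent across blocks, the transmitted difference is close to zero and therefore cheap to entropy code.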
In the prior art, digital image/video compression systems use various techniques of prediction to reduce data redundancy. In block-based systems, to efficiently encode a block of pixels, a prediction block is constructed based on previously decoded data. That prediction block is subtracted from the source data and the residual signal is encoded using techniques such as transform coding. At the decoder the prediction block may be created from data that has already been decoded and the prediction error signal added back in to produce the reconstructed block.
The terms intra- and inter-prediction indicate that the prediction block is formed from data from the same image/video frame and previously decoded frame(s), respectively.
Sub-pixel motion estimation is used to build a prediction of a block that has moved from one frame to the next by something other than a whole number of pixels. In sub-pixel motion estimation, the system attempts to estimate what the block would look like if the real object had moved by a non-integral amount.
The prior art used a fixed set of interpolating filters to predict ½, ¼ and even ⅛ pixel moves. The problem with this technique is twofold: first, the longer the filter, the more likely it is to reproduce an image artifact; second, shorter filters perform a less accurate interpolation and thus tend to blur real image detail.
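The trade-off between filter lengths can be seen by interpolating half-pixel positions along one row of samples with a two-tap and a six-tap filter. This is an illustrative sketch only: the six-tap weights (1, -5, 20, 20, -5, 1)/32 are those commonly cited for H.264 half-sample luma interpolation and are used here as a familiar example, not as the filters of any system described in this document. Results are left unclipped so that overshoot is visible.

```python
SIX_TAP = (1, -5, 20, 20, -5, 1)  # weights sum to 32

def half_pel_two_tap(samples, i):
    """Half-pixel value between samples[i] and samples[i + 1] (bilinear)."""
    return (samples[i] + samples[i + 1] + 1) >> 1

def half_pel_six_tap(samples, i):
    """Half-pixel value between samples[i] and samples[i + 1].

    Requires 2 <= i <= len(samples) - 4. The result is not clipped.
    """
    window = samples[i - 2:i + 4]
    acc = sum(t * s for t, s in zip(SIX_TAP, window))
    return (acc + 16) >> 5

edge = [0, 0, 0, 0, 255, 255, 255, 255]   # sharp step, e.g. a block edge
detail = [0, 0, 0, 255, 0, 0, 0, 0]       # one-sample-wide feature

for i in (2, 3, 4):
    print(i, half_pel_two_tap(edge, i), half_pel_six_tap(edge, i))
print(half_pel_two_tap(detail, 3), half_pel_six_tap(detail, 3))
```

Near the step, the six-tap filter produces values outside the 0 to 255 range (ringing that carries the edge into neighbouring positions), while next to the narrow feature the two-tap filter retains less of the peak, which is the blurring described above.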
The prior art, including some standards-based codecs such as H.264, describes the use of various types of filters for smoothing the discontinuities that arise between blocks coded using discrete cosine transforms (DCT) or other similar block-based transforms.
The problem with conventional loop filters is that they typically either fail to adequately remove false block discontinuities or over-smooth the reconstructed image and hence suppress real image detail.
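Both failure modes can be reproduced with a very simple edge filter governed by a single fixed threshold. The filter below is a hypothetical sketch written for this illustration; it is not the H.264 deblocking filter nor the filter of this invention.

```python
def filter_edge(row, edge, threshold):
    """Smooth the two samples on either side of a block edge.

    `edge` is the index of the first sample of the right-hand block.
    The samples are modified only when the step across the edge is
    smaller than `threshold` (hypothetical fixed-threshold rule).
    """
    p1, p0, q0, q1 = row[edge - 2], row[edge - 1], row[edge], row[edge + 1]
    out = list(row)
    if abs(p0 - q0) < threshold:
        out[edge - 1] = (p1 + 2 * p0 + q0 + 2) >> 2
        out[edge] = (p0 + 2 * q0 + q1 + 2) >> 2
    return out

blocky = [100, 100, 100, 100, 108, 108, 108, 108]     # false discontinuity
real_edge = [100, 100, 100, 100, 200, 200, 200, 200]  # genuine image edge

print(filter_edge(blocky, 4, threshold=16))      # step is softened
print(filter_edge(blocky, 4, threshold=4))       # step is left in place
print(filter_edge(real_edge, 4, threshold=16))   # real edge preserved
print(filter_edge(real_edge, 4, threshold=255))  # real edge smeared
```

A threshold that is too low leaves the blocking artifact untouched, and one that is too high smears genuine detail, so no single fixed setting suits every block.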
This invention relates to an improved method for loop filtering that includes adaptive techniques that maximize the beneficial effects of the filter and minimize the artifacts.