One nearly universal aspect of video compression standards creates problems. Most video compression standards divide each input field or frame into blocks and macroblocks of fixed size. Pixels within each macroblock are treated as a group, without reference to pixels in other macroblocks. A typical technique transforms the pixel data into a spatial frequency domain, for example via a discrete cosine transform (DCT). The frequency domain data is quantized and encoded from low frequency to high frequency. Most of the energy in the frequency domain data is usually concentrated in the low frequencies, so the high frequency coefficients often quantize to zero. An end of block symbol therefore allows coding of the remaining high frequency symbols to be truncated. The resulting quantized data is typically entropy coded. In entropy coding, more frequently used symbols are coded with fewer bits than less frequently used symbols. The net result is a reduction in the amount of data needed to encode video.
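The transform, quantize, and truncate steps described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the procedure of any particular standard: it uses a one-dimensional 8-sample row instead of a two-dimensional block, a single uniform quantizer step, and omits entropy coding. The names `dct_1d`, `quantize`, `truncate_at_eob`, and `EOB` are hypothetical and chosen for this example only.

```python
import math

EOB = "EOB"  # hypothetical end-of-block marker for this sketch


def dct_1d(samples):
    """Orthonormal DCT-II of a 1-D list of samples."""
    n = len(samples)
    out = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        total = sum(
            s * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
            for i, s in enumerate(samples)
        )
        out.append(scale * total)
    return out


def quantize(coeffs, step):
    """Uniform quantization: divide by the step size and round to an integer."""
    return [int(round(c / step)) for c in coeffs]


def truncate_at_eob(levels):
    """Keep levels up to the last nonzero one, then append the EOB marker."""
    last = -1
    for i, level in enumerate(levels):
        if level != 0:
            last = i
    return levels[: last + 1] + [EOB]


# A smooth ramp: almost all of its energy sits in the lowest frequencies.
row = [100, 102, 104, 106, 108, 110, 112, 114]
levels = quantize(dct_1d(row), step=16)
symbols = truncate_at_eob(levels)
print(symbols)
```

For smooth input such as this ramp, only the first few quantized levels are nonzero, so the symbol list is much shorter than the eight input samples.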
Coding in separate macroblocks can create coding artifacts at the block and macroblock boundaries. Because adjacent macroblocks may be encoded differently, the image may not mesh well at the macroblock boundary. For example, other features of a macroblock may cause it to be coded with a different quantization parameter than its neighbor. Upon decoding, the same color or gray-scale value on either side of the macroblock boundary may then be displayed differently because of this different quantization.
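A small numerical sketch shows how the same gray-scale value can decode differently on either side of a boundary. It assumes a flat 8-sample block, so only the DC coefficient of an orthonormal DCT is nonzero, and a uniform quantizer. The step sizes 16 and 40 and the function name `decode_flat_block` are illustrative assumptions, not values taken from any standard.

```python
import math

BLOCK_SIZE = 8  # assumed number of samples per block in this sketch


def decode_flat_block(pixel_value, qstep):
    """Encode and decode a flat block whose samples all equal pixel_value.

    For a flat block only the DC coefficient is nonzero, so the round trip
    reduces to quantizing and reconstructing that single coefficient.
    """
    dc = pixel_value * math.sqrt(BLOCK_SIZE)   # orthonormal DCT DC term
    level = round(dc / qstep)                  # quantize
    recon_dc = level * qstep                   # dequantize
    return round(recon_dc / math.sqrt(BLOCK_SIZE))  # back to a pixel value


# Two neighboring blocks hold the same gray level but use different steps.
left = decode_flat_block(130, qstep=16)
right = decode_flat_block(130, qstep=40)
step_at_boundary = abs(left - right)
print(left, right, step_at_boundary)
```

The two blocks start from an identical value, yet the coarser quantizer reconstructs a slightly different level, which appears as a visible step along the shared edge.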
Recently the H.264 standard has proposed deblock filtering at the block boundaries for both encoding and decoding. This deblocking can enhance the perceived image quality by reducing the blocking artifacts caused by block and macroblock encoding. The deblocking technique adopted in this standard requires an extensive decision matrix to determine whether to filter on block edges and which filter to employ. The standards group has published proposed program code to implement this deblocking. The proposed program code includes extensive conditional branching, which makes it unsuitable for deeply pipelined processors and application specific integrated circuit (ASIC) implementations. In addition, the proposed program code exposes little parallelism, which makes it unsuitable for very long instruction word (VLIW) processors and parallel hardware implementations. This is particularly unfortunate in the case of VLIW processors, which are otherwise well suited to video encoding/decoding applications.
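The contrast between branching code and code with a uniform data flow can be sketched with a toy edge filter. This is not the H.264 deblocking filter and does not reproduce the standards group's program code. The threshold `alpha`, the quarter-difference correction, and both function names are hypothetical. Python is used only to show the shape of the computation: on a VLIW or parallel hardware target, the comparison would typically produce a mask that is applied arithmetically in every lane.

```python
def filter_edge_branching(p0, q0, alpha):
    """Toy filter with a data-dependent branch for each edge sample pair."""
    if abs(p0 - q0) < alpha:
        delta = (q0 - p0) // 4
        return p0 + delta, q0 - delta
    return p0, q0


def filter_edge_uniform(p0, q0, alpha):
    """Same result, but every sample pair follows the same instruction path.

    The filter decision becomes a 0/1 value that scales the correction,
    so no control flow depends on the pixel data.
    """
    do_filter = int(abs(p0 - q0) < alpha)      # 1 when filtering applies
    delta = ((q0 - p0) // 4) * do_filter       # zero correction otherwise
    return p0 + delta, q0 - delta


def filter_row_uniform(pairs, alpha):
    """Apply the uniform filter to a row of (p0, q0) pairs across an edge.

    Each pair is independent, so the iterations could run in parallel.
    """
    return [filter_edge_uniform(p0, q0, alpha) for p0, q0 in pairs]


pairs = [(100, 108), (100, 140), (50, 44)]
print(filter_row_uniform(pairs, alpha=10))
```

Because each pair is processed by the same sequence of operations with no data-dependent jump, the work can be scheduled statically and spread across parallel functional units.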