Transformation and quantization in block-based video codecs introduce blocking artifacts at block edges. A specially optimized video filter, called the de-blocking filter, is conditionally applied on 4×4/8×8 pixel block boundaries to enhance visual quality and improve prediction efficiency. Most recent video codecs, such as H.264, H.265 (HEVC), and VC-1, use an in-loop de-blocking filter in the decoder path. Each video codec standard defines a fixed order of filter operations so that decoder output is consistent across all implementations.
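To make the notion of a fixed edge order concrete, the following minimal sketch enumerates the luma edges of one 16×16 macroblock in the order H.264 prescribes: vertical edges from left to right, then horizontal edges from top to bottom. The function name `luma_edge_order` and its parameters are hypothetical and chosen only for illustration; boundary-strength decisions and the filter arithmetic itself are omitted.

```python
def luma_edge_order(mb_size=16, block=4):
    """Return luma edges of one macroblock in H.264-style filtering order.

    Hypothetical helper for illustration only. Each entry is (direction, offset):
    "V" is a vertical edge at column `offset`, "H" is a horizontal edge at row
    `offset`. Offset 0 is the macroblock boundary shared with the left or top
    neighbour.
    """
    offsets = range(0, mb_size, block)
    # All vertical edges are filtered first, left to right.
    vertical = [("V", x) for x in offsets]
    # Horizontal edges follow, top to bottom, and operate on pixels that the
    # vertical pass may already have modified.
    horizontal = [("H", y) for y in offsets]
    return vertical + horizontal


if __name__ == "__main__":
    for direction, offset in luma_edge_order():
        print(direction, offset)
```

Calling `luma_edge_order(block=8)` gives the reduced edge set for a macroblock coded with 8×8 transforms. Because every horizontal edge depends on the result of the vertical pass, a hardware implementation that follows this order literally has to hold partially filtered pixels until the second pass reaches them.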
The standard-defined fixed edge order is not optimal for the various architectures of a de-blocking filter hardware accelerator (HWA), because following it forces a compromise on performance, power, or area. An in-loop de-blocking filter integrated in a video processing engine that runs a macroblock (MB) level pipeline faces particular difficulty in handling pixels at MB boundaries. Unfiltered, partially filtered, and fully filtered pixels must be loaded into and stored out of internal storage concurrently with the filter operation, and with the standard-defined edge order this is difficult to achieve without stalls caused by shared memory access.