Increasing demand for high definition TV products, including interactive TV in an HD format and HD video compression encoding and decoding, requires increasing sophistication, flexibility, and performance in the supporting electronics. The sophistication, flexibility, and performance requirements for HD TV exceed the capabilities of current generations of processor architectures by, in many cases, orders of magnitude.
The demands of video encoding for HD formats are both memory intensive and data processing intensive, requiring efficient, high bandwidth memory organizations coupled with compute intensive capabilities. In addition, a video encoding product must be capable of supporting multiple standards, each of which includes multiple optional features that can be supported to improve image quality and further reduce compression bandwidth. Due to these multiple demands, a flexible parallel processing approach must be found to meet the demands in a cost effective manner.
A number of algorithmic capabilities are generally common to multiple video encoding standards, such as MPEG-2, H.264, and SMPTE-VC-1. Motion estimation/compensation and deblocking filtering are two examples of general algorithms that are required for video encoding. To efficiently support motion estimation algorithms and other complex programmable functions, whose requirements may vary across the multiple standards, a processor by itself would require significant parallelism and very high clock rates. A processor of this capability would be difficult to develop in a cost effective manner for commercial products.
Motion estimation/compensation methods exploit the temporal picture structure of a video sequence by reducing the redundancy inherent in the sequential picture frames. They represent a central part of the video encoding process of the MPEG-4 AVC/H.264 and SMPTE-VC-1 video encoding standards. These methods are computationally the most expensive part of a digital video encoding process. On average they take about 60-80% of the total available computational time, and thus have the highest impact on the speed of the overall encoding. They also have a major impact on the visual quality of encoded video sequences.
The most common motion estimation/compensation algorithms are block matching algorithms operating in the time domain. Here motion vectors are used to describe the best temporal prediction for a current block of pixels to be encoded. A time domain prediction error between the current block of pixels and the reference block of pixels is formed, and a search is performed to minimize this value.
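As an illustration of the block matching search described above, the following is a minimal sketch in Python. It assumes the prediction error is measured as a sum of absolute differences (SAD) and that the search is exhaustive over a square window; the text does not specify either choice. The names sad, full_search, and search_range are hypothetical and chosen only for this example.

```python
def sad(cur, ref, bx, by, rx, ry, n):
    """Sum of absolute differences between the n x n block of `cur` at
    (bx, by) and the n x n block of `ref` at (rx, ry).
    SAD is an assumed error measure; the text only says "prediction error"."""
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[ry + y][rx + x])
    return total


def full_search(cur, ref, bx, by, n, search_range):
    """Exhaustive search for the integer motion vector (dx, dy) that
    minimizes SAD for the current block at (bx, by).
    Returns (dx, dy, cost). Frames are lists of rows of pixel values."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            rx, ry = bx + dx, by + dy
            # Skip candidate blocks that fall outside the reference frame.
            if rx < 0 or ry < 0 or rx + n > width or ry + n > height:
                continue
            cost = sad(cur, ref, bx, by, rx, ry, n)
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    return best[1], best[2], best[0]
```

Each candidate costs n * n absolute differences, and the window holds (2 * search_range + 1) squared candidates, which gives a sense of why this stage dominates encoding time at HD frame sizes.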
The process of half pixel motion search refinement for HD TV picture frame sizes is computationally very intensive due to the requirement to apply a 6-tap FIR filter on a two-dimensional array of pixels in order to produce half pixel values. In addition, an intermediate step of transposition of the two-dimensional array is needed which requires memory addressing on pixel boundaries, thus demanding a prohibitive number of extra computational cycles from a processor in the array.
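The filtering and transposition steps described above can be sketched as follows. The text states only that a 6-tap FIR filter is used; the coefficients (1, -5, 20, 20, -5, 1) and the rounding shown here are those of H.264 luma half-sample interpolation and are assumed for this example. All function names are hypothetical, and the sketch ignores the memory addressing costs that motivate the discussion.

```python
# Assumed kernel: the 6-tap filter used for H.264 luma half-sample positions.
TAPS = (1, -5, 20, 20, -5, 1)


def clip8(v):
    """Clamp a value to the 8-bit pixel range."""
    return max(0, min(255, v))


def filter_rows(block):
    """Apply the 6-tap filter along each row, without normalization.
    A row of length w yields w - 5 intermediate values."""
    return [[sum(t * row[i + k] for k, t in enumerate(TAPS))
             for i in range(len(row) - 5)]
            for row in block]


def transpose(block):
    """Swap rows and columns of a two-dimensional list."""
    return [list(col) for col in zip(*block)]


def half_pel_horizontal(block):
    """Half-pixel values between horizontally adjacent pixels."""
    return [[clip8((v + 16) >> 5) for v in row] for row in filter_rows(block)]


def half_pel_vertical(block):
    """Half-pixel values between vertically adjacent pixels, obtained by
    transposing, filtering rows, and transposing back."""
    return transpose(half_pel_horizontal(transpose(block)))


def half_pel_center(block):
    """Half-pixel values at diagonal positions: filter rows, transpose,
    filter rows again, then round once and transpose back."""
    inter = filter_rows(transpose(filter_rows(block)))
    return transpose([[clip8((v + 512) >> 10) for v in row] for row in inter])
```

The explicit transpose calls stand in for the intermediate transposition step: the same row filter is reused for the vertical pass, at the price of rearranging the whole pixel array between passes.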
It will be highly advantageous to efficiently address such problems as discussed in greater detail below.