Processing compressed digital video requires an enormous amount of computational power. Modern central processing units (CPUs) are not keeping pace with this demand, so video compression and processing tasks run slowly. This has different ramifications for different users. High-end professional and broadcast infrastructure applications currently use specialized hardware. This hardware is produced in low volumes and thus tends to be expensive. Video editing hobbyists and average consumers, on the other hand, rarely purchase expensive hardware to augment an off-the-shelf personal computer. Instead, these users rely entirely on the computer's CPU to perform the tasks sequentially. The tasks then run much slower than real time, and the user must wait a long time for basic operations such as converting a video file from one format to another.
Moving to a parallel architecture has the potential to accelerate many of these tasks. However, significant parallelization is difficult to achieve because block-based codec algorithms require some serialization: neighboring blocks must be coded before the current block. Multiple blocks cannot simply be processed at the same time, because each relies on information from neighboring blocks that may not have been processed yet. Intra prediction, motion estimation and compensation, and deblocking are just a few examples of block-based calculations that rely on neighboring blocks. Utilizing stream processor architectures with conventional algorithms provides no performance increase for these operations.
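The neighbor dependency described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that each block depends on its left, top-left, top, and top-right neighbors; the actual neighbor set depends on the codec and the operation. The names `NEIGHBOR_OFFSETS`, `neighbors`, `is_ready`, `ready_blocks`, and `raster_order_is_valid` are hypothetical and are not taken from any codec or library.

```python
# Assumed neighbor set (hypothetical): left, top-left, top, top-right,
# expressed as (dx, dy) offsets in block units.
NEIGHBOR_OFFSETS = [(-1, 0), (-1, -1), (0, -1), (1, -1)]


def neighbors(x, y, width, height):
    """Return the in-frame blocks that block (x, y) depends on."""
    result = []
    for dx, dy in NEIGHBOR_OFFSETS:
        nx, ny = x + dx, y + dy
        if 0 <= nx < width and 0 <= ny < height:
            result.append((nx, ny))
    return result


def is_ready(x, y, coded, width, height):
    """A block may be processed only after all of its neighbors are coded."""
    return all(n in coded for n in neighbors(x, y, width, height))


def ready_blocks(coded, width, height):
    """List the not-yet-coded blocks whose dependencies are all satisfied."""
    return [
        (x, y)
        for y in range(height)
        for x in range(width)
        if (x, y) not in coded and is_ready(x, y, coded, width, height)
    ]


def raster_order_is_valid(width, height):
    """Check that sequential raster order never violates a dependency."""
    coded = set()
    for y in range(height):
        for x in range(width):
            if not is_ready(x, y, coded, width, height):
                return False
            coded.add((x, y))
    return True


if __name__ == "__main__":
    W, H = 4, 3  # frame size in blocks, chosen arbitrarily
    # Before anything is coded, only the top-left block can start.
    print(ready_blocks(set(), W, H))
    # Sequential raster order always satisfies the dependencies.
    print(raster_order_is_valid(W, H))
```

Under these assumptions, launching every block at once would fail for all but the top-left block, whereas a sequential raster scan always finds the required neighbors already coded. This is why a conventional algorithm maps naturally onto a single CPU and poorly onto a parallel processor.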
A need therefore remains for improvements in video processing that increase performance, especially speed, while leveraging relatively low-cost hardware.

Several preferred examples of the present application will now be described with reference to the accompanying drawings. Various other examples of the invention are also possible and practical. This application may be exemplified in many different forms and should not be construed as being limited to the examples set forth herein.