A video stream consists of a sequence of video frames, where each frame is divided into multiple macroblocks. Each macroblock is typically a 16×16 array of pixels, although other sizes of macroblocks are also possible. Video codecs (COmpressor-DECompressor) are software, hardware, or combined software and hardware implementations of compression algorithms designed to encode/compress and decode/decompress video data streams to reduce the size of the streams for faster transmission and smaller storage space. Although typically lossy, video codecs attempt to maintain video quality while compressing the binary data of a video stream. Examples of popular video codecs include WMV and RealVideo, as well as implementations of compression standards such as MPEG-2, MPEG-4, H.261, H.263, and H.264.
Under the H.264 compression standard, a macroblock of a video frame can be intra encoded as a 16×16 pixel array, the pixel values of the array being predicted using values calculated from previously encoded macroblocks. A 16×16 macroblock can also be intra encoded as sixteen 4×4 pixel arrays, where pixel values in each 4×4 array are predicted using values calculated from previously encoded 4×4 arrays. There are four possible intra prediction modes for 16×16 arrays (luma blocks) and nine possible intra prediction modes for 4×4 arrays (luma blocks).
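To make the idea of predicting a 4×4 array from previously encoded neighbors concrete, the following is a minimal sketch of three of the nine 4×4 luma modes (vertical, horizontal, and DC). The function name `predict_4x4` and its argument layout are illustrative assumptions, not taken from the text above, and the sketch assumes all eight neighboring pixels are available.

```python
def predict_4x4(above, left, mode):
    """Predict a 4x4 block from already-reconstructed neighbouring pixels.

    above: the 4 pixels in the row directly above the block
    left:  the 4 pixels in the column directly left of the block
    mode:  "vertical", "horizontal", or "dc" (three of the nine 4x4 modes)
    """
    if mode == "vertical":
        # Each column repeats the pixel above it.
        return [list(above) for _ in range(4)]
    if mode == "horizontal":
        # Each row repeats the pixel to its left.
        return [[left[r]] * 4 for r in range(4)]
    if mode == "dc":
        # Every pixel is the rounded mean of the eight neighbours.
        dc = (sum(above) + sum(left) + 4) >> 3
        return [[dc] * 4 for _ in range(4)]
    raise ValueError(f"unsupported mode: {mode}")


# Hypothetical neighbour values, for illustration only.
block = predict_4x4([1, 2, 3, 4], [5, 6, 7, 8], "dc")
```

The remaining six 4×4 modes are directional and are omitted here for brevity.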
As such, in encoding a macroblock, two determinations (selections) must be made: 1) whether the macroblock is to be encoded as a 16×16 array (referred to herein as 16×16 encoding) or as sixteen 4×4 arrays (referred to herein as 4×4 encoding), and 2) the predictive mode(s) to be used to encode the macroblock. For example, if it is determined that the macroblock is to be encoded as a 16×16 array, it must also be determined which of the four predictive modes for the 16×16 array is to be used. If it is determined that the macroblock is to be encoded as sixteen 4×4 arrays, it must also be determined, for each of the sixteen 4×4 arrays, which of the nine predictive modes for the 4×4 array is to be used. The first determination is referred to herein as encoding type selection and the second is referred to herein as predictive mode selection.
Encoding type selection and predictive mode selection are made using cost functions. For example, cost functions are typically used to determine whether a macroblock is to be encoded as a 16×16 array or as sixteen 4×4 arrays, where the type of encoding (16×16 or 4×4 encoding) having the lower cost is chosen. Cost is typically equal to either the distortion alone or a weighted combination of the distortion and an estimate of the number of bits produced by the prediction mode, where an increase in distortion and/or number of bits increases the cost. Distortion reflects the difference between original pixel values and predicted (or encoded) values and can be measured in various ways. For example, distortion can be measured as the sum of the absolute differences between the original pixel values and predicted (or encoded) values.
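As an illustration of such a cost function, the sketch below measures distortion as the sum of absolute differences and optionally adds a weighted bit estimate. The names `sad`, `cost`, `estimated_bits`, and `lam` are hypothetical, and the additive form `distortion + lam * bits` is one common way to weight the two terms, assumed here rather than prescribed by the text.

```python
def sad(original, predicted):
    """Sum of absolute differences between two equally sized 2-D blocks."""
    return sum(
        abs(o - p)
        for orig_row, pred_row in zip(original, predicted)
        for o, p in zip(orig_row, pred_row)
    )


def cost(original, predicted, estimated_bits=0, lam=0.0):
    """Distortion plus a weighted bit estimate.

    With lam == 0 the cost reduces to the distortion alone.
    lam (the weight) and estimated_bits are illustrative parameters.
    """
    return sad(original, predicted) + lam * estimated_bits


# Hypothetical 2x2 blocks, for illustration only.
original = [[1, 2], [3, 4]]
predicted = [[1, 1], [5, 4]]
distortion_only = cost(original, predicted)
with_bits = cost(original, predicted, estimated_bits=10, lam=0.5)
```

Either a larger distortion or a larger bit estimate raises the returned value, matching the behavior described above.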
An exhaustive search approach to selecting an optimal encoding type (16×16 or 4×4 encoding) and optimal predictive mode(s) for a macroblock involves determining costs of all four 16×16 prediction modes and all combinations of nine 4×4 prediction modes for sixteen 4×4 blocks in the macroblock, where a 16×16 prediction mode or a particular combination of 4×4 prediction modes that gives the lowest cost is selected. For each macroblock, the exhaustive search approach requires consideration of 9^16 different combinations of 4×4 prediction modes, rendering the exhaustive search approach practically infeasible.
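The size of the exhaustive search space can be checked with a few lines of arithmetic. The constant names below are illustrative.

```python
MODES_16X16 = 4   # prediction modes for a 16x16 luma block
MODES_4X4 = 9     # prediction modes for a 4x4 luma block
BLOCKS_4X4 = 16   # number of 4x4 blocks in one macroblock

# Every 4x4 block independently takes one of nine modes.
exhaustive_4x4 = MODES_4X4 ** BLOCKS_4X4

# Candidates considered per macroblock by the exhaustive search.
exhaustive_total = MODES_16X16 + exhaustive_4x4

print(exhaustive_4x4)  # 1853020188851841
```

At roughly 1.85 × 10^15 combinations per macroblock, evaluating every candidate is out of reach for any practical encoder.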
As such, the following operations are typically performed to determine the encoding type and predictive mode(s) for a macroblock:
1) Compute the cost of all four possible 16×16 predictive modes.
2) For each of the sixteen 4×4 blocks, select the predictive mode (among the 9 predictive modes) having the lowest cost, and then compute the total cost of the resulting combination (i.e., the sum cost of the sixteen determined costs).
3) Compare the cost determined at step 1 with the cost determined at step 2 and select the lowest one. This selection provides both the encoding type selection and the predictive mode(s) selection.
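The three operations above can be sketched as follows. This is a simplified illustration, not a definitive encoder: it assumes all costs have already been computed and are passed in as plain lists, and the function name `select_encoding`, its return shape, and the tie-breaking rule (preferring 16×16 on equal cost) are assumptions made for the example.

```python
def select_encoding(costs_16x16, costs_4x4):
    """Pick the encoding type and predictive mode(s) from precomputed costs.

    costs_16x16: 4 costs, one per 16x16 predictive mode
    costs_4x4:   16 lists of 9 costs, one list per 4x4 block
    Returns (encoding_type, modes, total_cost).
    """
    # Step 1: lowest cost among the four 16x16 modes.
    mode_16 = min(range(len(costs_16x16)), key=costs_16x16.__getitem__)
    cost_16 = costs_16x16[mode_16]

    # Step 2: lowest-cost mode for each 4x4 block, then sum those costs.
    modes_4 = [min(range(len(c)), key=c.__getitem__) for c in costs_4x4]
    cost_4 = sum(c[m] for c, m in zip(costs_4x4, modes_4))

    # Step 3: keep the cheaper encoding type (ties go to 16x16 here).
    if cost_16 <= cost_4:
        return "16x16", [mode_16], cost_16
    return "4x4", modes_4, cost_4


# Hypothetical cost values, for illustration only.
costs_16 = [100, 80, 90, 120]
costs_4 = [[10, 9, 8, 7, 6, 5, 4, 3, 2] for _ in range(16)]
choice = select_encoding(costs_16, costs_4)
```

With these inputs each 4×4 block picks its last mode at cost 2, giving a total of 32, which beats the best 16×16 cost of 80, so 4×4 encoding is selected.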
The conventional approach, however, still involves determining costs for 9×16 = 144 different pairings of a 4×4 block with a 4×4 predictive mode, plus the costs for the four 16×16 predictive modes.