1. Field of the Invention
The present invention relates to calculating the average of four integer numbers, and more particularly to single instruction cycle calculation of the average of four signed or unsigned integer numbers with a correctly rounded result and without errors due to overflow of intermediate results.
2. Description of Related Art
The Moving Picture Experts Group (MPEG) standard has emerged as the dominant standard for compressed digital video. The MPEG-1 standard specifies a compressed video bit-stream rate of approximately 1.5 megabits per second, and a compressed stereo audio bit-stream rate of approximately 250 kilobits per second. The second standard, MPEG-2, will specify compression rates for higher-bandwidth distribution media, and is in the process of being formalized. To meet the MPEG standard, video and audio compression and decompression products must rapidly process the various algorithms used to implement the MPEG standard.
The MPEG standards impose the need for bi-directional temporal differential pulse code modulation (DPCM) and half pixel motion estimation. FIG. 1 shows an illustrative block of pixels X. In practice, a block contains more pixels than shown in FIG. 1, which is abbreviated for clarity. For example, typically in video signal processing, the basic video information processing unit is a macro-block, which has a 16×16 pixel matrix comprising four 8×8 luminance blocks, and two 8×8 chrominance blocks. Each macro-block is part of a much larger luminance or chrominance frame, as the case may be. In FIG. 1, the pixel X represents either luminance or chrominance, with the output corresponding to an unsigned integer number.
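As a rough sketch of the macro-block layout just described, the following C declaration stores the four 8×8 luminance blocks and two 8×8 chrominance blocks as arrays of unsigned 8-bit samples. The type name `MacroBlock`, the field names, and the 8-bit sample width are assumptions made for illustration only.

```c
#include <stdint.h>

/* Hypothetical names; dimensions follow the macro-block description above. */
enum { BLOCK_DIM = 8, LUMA_BLOCKS = 4, CHROMA_BLOCKS = 2 };

typedef struct {
    /* Four 8x8 luminance blocks, together covering the 16x16 pixel matrix. */
    uint8_t luma[LUMA_BLOCKS][BLOCK_DIM][BLOCK_DIM];
    /* Two 8x8 chrominance blocks. */
    uint8_t chroma[CHROMA_BLOCKS][BLOCK_DIM][BLOCK_DIM];
} MacroBlock;
```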
MPEG motion processing involves half pixel motion estimation as well as full pixel motion estimation. In FIG. 1, the "H" points represent horizontal interpolations, the "V" points represent vertical interpolations, and the "Y" points represent both horizontal and vertical interpolations. The interpolations "H" and "V" are calculated in accordance with the expression

$$(X_1 + X_2)/2 \qquad (1)$$

wherein $X_1$ and $X_2$ are horizontally contiguous pixels for the interpolation "H" and are vertically contiguous pixels for the interpolation "V." The interpolations "Y" are calculated in accordance with the expression

$$(X_1 + X_2 + X_3 + X_4)/4 \qquad (2)$$

wherein $X_1$ and $X_3$, and $X_2$ and $X_4$, are diagonally contiguous pixels. In expressions (1) and (2), the symbol "/" as specified by the MPEG standard represents integer division with rounding towards zero. In rounding towards zero, every non-integer quotient is rounded to the adjacent integer of smaller magnitude. That is, the integer component is left intact and the fractional component is truncated. For instance, 7/4 and -7/-4 are rounded to 1, and -7/4 and 7/-4 are rounded to -1.
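Expressions (1) and (2) can be sketched directly in C, since C99 and later define integer `/` as truncating toward zero, which matches the rounding described above. The function names `interp_hv` and `interp_y` are hypothetical, and the operands are assumed small enough (for example 8-bit pixel values held in `int`) that the sums cannot overflow.

```c
/* Expression (1): "H" or "V" interpolation of two contiguous pixels.
 * Integer division in C99 and later truncates toward zero. */
int interp_hv(int x1, int x2)
{
    return (x1 + x2) / 2;
}

/* Expression (2): "Y" interpolation of four pixels, also rounded toward
 * zero. Assumes the sum of the four operands fits in an int. */
int interp_y(int x1, int x2, int x3, int x4)
{
    return (x1 + x2 + x3 + x4) / 4;
}
```

For example, `interp_y(1, 2, 2, 2)` divides 7 by 4 and yields 1, while `interp_y(-1, -2, -2, -2)` divides -7 by 4 and yields -1.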
Expression (2) has been implemented by right-shifting $X_1$, $X_2$, $X_3$ and $X_4$ by two bits, summing the right-shifted operands to provide a result, obtaining a separate sum of the shifted-out bits, then rounding the result based on inspection of the sum of the shifted-out bits. While this is a simple operation, in some cases the result may need to be increased by two or three for proper rounding in accordance with the MPEG standard. In a general purpose computer, if a single increment is the only such operation available in one instruction cycle, then several instruction cycles may be needed to perform the consecutive increments.
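The shift-first approach described above might look roughly like the following C sketch. It is an illustration under stated assumptions rather than any particular prior implementation: the operands are assumed to be unsigned 8-bit pixels, the function name `avg4_shift_first` is hypothetical, and the loop of single increments merely models a machine whose only correction step is an increment.

```c
#include <stdint.h>

/* Hypothetical sketch of the shift-first method for unsigned 8-bit operands. */
uint8_t avg4_shift_first(uint8_t x1, uint8_t x2, uint8_t x3, uint8_t x4)
{
    /* Sum of the right-shifted operands: at most 4 * 63 = 252, no overflow. */
    uint8_t result = (uint8_t)((x1 >> 2) + (x2 >> 2) + (x3 >> 2) + (x4 >> 2));

    /* Separate sum of the shifted-out bits: ranges from 0 to 12. */
    unsigned low = (x1 & 3u) + (x2 & 3u) + (x3 & 3u) + (x4 & 3u);

    /* The shifted-out bits contribute 0, 1, 2, or 3 whole units. */
    unsigned increments = low >> 2;

    /* One increment per iteration, modelling one instruction cycle each. */
    for (unsigned i = 0; i < increments; i++) {
        result++;
    }
    return result;
}
```

With operands 255, 255, 255, 255 the shifted sum is 252 and the shifted-out bits sum to 12, so three increments are needed to reach 255.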
Another known implementation of expression (2) includes summing $X_1$ and $X_2$ to provide a first intermediate result, summing $X_3$ and $X_4$ to provide a second intermediate result, adding the first and second intermediate results to provide a third intermediate result, inspecting the two least significant bits of the third intermediate result, right-shifting the third intermediate result by two bits, and rounding the shifted result based on inspection of the two pre-shifted least significant bits of the third intermediate result. A drawback to this approach is that each of the three summing operations may require a separate instruction cycle. Furthermore, any of the three summing operations may produce an overflow that leads to an improperly rounded result.
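The sum-first approach and its overflow hazard can be illustrated with the following C sketch. Both function names are hypothetical. The first function assumes `int` intermediates wide enough to hold the sums and an arithmetic right shift for negative values (implementation-defined in C, though common). The second deliberately keeps every intermediate in 8 bits to show how a wrapped sum corrupts the result.

```c
#include <stdint.h>

/* Sum-first method with wide intermediates (assumed not to overflow). */
int avg4_sum_first(int x1, int x2, int x3, int x4)
{
    int s1 = x1 + x2;      /* first intermediate result  */
    int s2 = x3 + x4;      /* second intermediate result */
    int s3 = s1 + s2;      /* third intermediate result  */
    int low2 = s3 & 3;     /* two least significant bits before the shift */
    int shifted = s3 >> 2; /* assumes arithmetic shift: rounds toward -infinity */

    /* A negative, inexact quotient was rounded down; step back toward zero. */
    if (s3 < 0 && low2 != 0) {
        shifted += 1;
    }
    return shifted;
}

/* Same sequence with 8-bit intermediates: each sum wraps modulo 256. */
uint8_t avg4_sum_first_8bit(uint8_t x1, uint8_t x2, uint8_t x3, uint8_t x4)
{
    uint8_t s1 = (uint8_t)(x1 + x2);
    uint8_t s2 = (uint8_t)(x3 + x4);
    uint8_t s3 = (uint8_t)(s1 + s2);
    return (uint8_t)(s3 >> 2);
}
```

With four operands of 200, the wide version returns 200, whereas the 8-bit version wraps at each sum (144, 144, then 32) and returns 8.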
In calculation-intensive applications such as MPEG motion processing, it is highly desirable to calculate the average of four integers rounded towards zero in a rapid and efficient manner.