I. Field of the Invention
This invention relates generally to computer technology, and more particularly, to efficient hardware design.
II. Background Information
Calculating intermediate values between two numbers is a common task. For example, intermediate values are required to blend colors in video processing applications. In addition, calculating intermediate values is common in hardware compression and decompression algorithms. Since the task of calculating intermediate values involves multiplying by a fraction, it consists mainly of dividing by the denominator of the fraction. Every fraction whose denominator is odd has an endless sequence of digits in the fraction's binary representation. For example, 1/7=0.001001001 . . . binary, ⅓=0.010101 . . . binary, and ⅖=0.011001100110 . . . binary. When the denominator is an odd number, the division might be inaccurate even when using common rounding techniques, since the binary fraction is an endless sequence of digits. A binary number is a base 2 number (i.e., 1001=20+23=9).
One method for rounding the result is to add ½ (0.1binary) to the total and truncating the remaining digits after the decimal point, so that fractions greater than or equal to ½ are rounded up, and fractions less than ½ are rounded down (i.e., the value is not changed). For example, 1.1binary would round up to 2 (1.1binary+0.1binary=10.0binary), whereas 1.01binary would round down to 1.0 (1.01binary+0.1binary=1.11binary and the remaining digits after the decimal point are truncated).
Adding ½ and truncating is not accurate enough when the denominator is odd so one method for improving accuracy is to add a lookup table to round the result, placed either before or after the division required to calculate the intermediate value. The problem with this and other methods for improving accuracy is that implementing them requires a large number of gates which is costly in terms of the chip area and processing time.
A blending function calculates an intermediate value. The blending function has the following formula: round(P*X+(1−P)*Y), where X and Y are integers, and P is a fraction between 0 and 1. The blending function is used, for example, to blend colors of three dimensional (“3-D”) objects.
Referring to the figures in which like numerals indicate like elements, FIG. 1a shows the hardware involved in a typical, intuitive implementation of the blending function. A multiplier 100 (in the figures, the multiplier is abbreviated to “mul” in order to conserve space and maintain clarity) is used to multiply the inputs together producing a multiplier output. An adder 110 is used to add the inputs together producing an output. The ½ is added to round off the result.
FIG. 1a shows how the typical hardware implementation for the blending function calculates a result given that X and Y are 12-bit inputs, and P is a fraction with a denominator of 5 and a numerator between 0 and the denominator (i.e., 1, 2, 3, or 4). In FIG. 1a, multiplier 100 to the left of adder 110 and is the top-most is used to multiply X and P's numerator. P's numerator can be either 1, 2, 3, or 4. Multiplier 100 to the left of adder 110 and to the bottom is used to multiply Y and (1−P)'s numerator. (1−P)'s numerator can be either 4, 3, 2, or 1. Adder 110 adds the results of these two multiplications together and sends the result of the addition to multiplier 100 located to the right of adder 110. This multiplier multiplies the result from adder 110 with the binary representation for ⅕ which is approximately 0.0011001100110011. Note that the result from this multiplication is accurate to 12 bits. One-half is added to this result for rounding purposes. The digits after the decimal point can be truncated to produce a rounded whole number result.
The blending function and numerous other functions multiply a value by a fraction. Multiplying a value by a fraction uses the formula: round(P*X) where X is a binary integer, and P is a fraction between 0 and 1. The same problems that hinder the implementation of the blending function also hinder the implementation of multiplying a value by a fraction (e.g., the implementation requires an excessive number of gates).
FIG. 1b shows the hardware involved in a typical, intuitive implementation for multiplying a value by a fraction. FIG. 1b shows how this hardware cooperates in order to multiply a value by a fraction given that the value, X, is a 12-bit input, and the fraction, P, has a denominator of 5 and a numerator between 0 and the denominator. In FIG. 1b, multiplier 100 on the left side multiplies together X and P's numerator which can be either 1, 2, 3, or 4. Multiplier 100 on the right side multiplies this result with the binary representation for ⅕ which is 0.0011001100110011. One-half is added to this result for rounding purposes. The digits after the decimal point can be truncated to produce a rounded whole number result.
The area required for the most efficient hardware implementation of the blending function shown in FIG. 1a is estimated to require 2415 gates (a gate on average has about 4 transistors). Multiplier 100 requires a much larger number of gates than adder 110. The computation time to perform the operation in FIG. 1 a is determined by the path from X or Y through multiplier 100 on the left side of the page, adder 110, and multiplier 100 on the right side of the page. Multiplier 100 is usually implemented using several adders in series thus resulting in a long computation time. Therefore, in general, a hardware implementation of an operation using adders is more efficient in terms of space and computation time than a hardware implementation using multipliers.