Field of the Invention
Embodiments of the present invention relate generally to computer processing and, more specifically, to a technique for performing arbitrary width integer arithmetic operations using fixed width elements.
Description of the Related Art
In computer systems in general, and in graphics processing units (GPUs) in particular, a great number of arithmetic operations are performed on both floating point numbers and integer numbers. Floating point numbers are typically 32 bits in width. In the common IEEE 754 single precision format, the most significant, or left-most, bit is the sign, the next 8 bits represent the exponent, and the 23 least significant, or right-most, bits store the mantissa, which together with an implied leading bit forms the 24 bit value of the number. A basic element of floating point arithmetic is the fused floating point multiply-add (FFMA), which multiplies two inputs and adds a third input to the resulting product. An FFMA unit first multiplies the mantissas of the two inputs, then shifts the product according to the relative values of the exponents before performing the final addition with the third mantissa.
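By way of illustration only, the mantissa multiplication described above can be sketched as follows. The sketch assumes the IEEE 754 single precision layout (1 sign bit, 8 exponent bits, 23 stored mantissa bits plus an implied leading 1 for normal numbers). The function names `decompose_float32` and `significand_product` are hypothetical and chosen for this example. Exponent handling, alignment, rounding, and the final addition are omitted.

```python
import struct

def decompose_float32(x):
    """Split a value into IEEE 754 single precision fields (assumed layout)."""
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    sign = bits >> 31
    exponent = (bits >> 23) & 0xFF
    fraction = bits & 0x7FFFFF
    # Normal numbers carry an implied leading 1, giving a 24 bit value;
    # a zero exponent field (zero or subnormal) has no implied bit.
    significand = (fraction | (1 << 23)) if exponent != 0 else fraction
    return sign, exponent, significand

def significand_product(a, b):
    """Multiply the two 24 bit mantissas; the result fits in 48 bits."""
    _, _, sig_a = decompose_float32(a)
    _, _, sig_b = decompose_float32(b)
    return sig_a * sig_b

if __name__ == "__main__":
    print(decompose_float32(1.5))
    print(hex(significand_product(1.5, 2.0)))
```

Because each operand of the multiplication is at most 24 bits wide, a 24 bit by 24 bit multiplier suffices for this stage.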
To accommodate the 24 bit mantissas of the two inputs, the multiplier in the FFMA is 24 bits wide. Signed and unsigned integer numbers are typically 32 bits wide, where all 32 bits represent the value of the integer. In order to perform the basic multiply-add function on 32 bit integer inputs, a 32 bit multiplier is required. When both floating point and integer arithmetic must be performed in a computer system or GPU, past implementations have often relied on a dedicated 32 bit multiply-add element to perform integer arithmetic, in addition to conventional FFMA elements. Alternatively, the multiplier in a conventional FFMA can be widened to 32 bits, with the additional 8 bits gated off when the FFMA is performing 24 bit multiplications.
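The width mismatch described above can be made concrete with a short sketch. It models a hypothetical multiplier, `narrow_mul`, that rejects operands wider than 24 bits. It then shows a generic textbook decomposition in which an unsigned 32 bit multiply-add is assembled from 16 bit halves that each fit such a multiplier. The 16 bit split, the names `narrow_mul` and `imad32`, and the truncation of the result to 32 bits are assumptions made for illustration. The sketch is not presented as the technique of any particular implementation.

```python
MULT_WIDTH = 24  # multiplier width taken from the text above

def narrow_mul(a, b):
    """Hypothetical fixed width multiplier: rejects operands over 24 bits."""
    if a.bit_length() > MULT_WIDTH or b.bit_length() > MULT_WIDTH:
        raise ValueError("operand exceeds multiplier width")
    return a * b

def imad32(a, b, c):
    """Unsigned (a * b + c) mod 2**32, built only from narrow multiplies."""
    mask = 0xFFFFFFFF
    a_lo, a_hi = a & 0xFFFF, (a >> 16) & 0xFFFF
    b_lo, b_hi = b & 0xFFFF, (b >> 16) & 0xFFFF
    low = narrow_mul(a_lo, b_lo)
    cross = narrow_mul(a_lo, b_hi) + narrow_mul(a_hi, b_lo)
    # a_hi * b_hi contributes only to bits 32 and above, so it is dropped
    # when the result is truncated to 32 bits.
    return (low + ((cross << 16) & mask) + c) & mask

if __name__ == "__main__":
    a, b, c = 123456789, 987654321, 42
    print(imad32(a, b, c) == (a * b + c) & 0xFFFFFFFF)
```

A full 32 bit operand passed directly to `narrow_mul` raises an error, which mirrors why a 24 bit multiplier alone cannot perform the 32 bit integer multiply in one step.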
One drawback of the first approach is that adding dedicated 32 bit multipliers undesirably increases the microcircuit real estate required in the system. It also increases leakage and overhead power loss, because the larger size entails longer connecting wires and, in turn, greater resistive and capacitive losses. Increasing the width of the FFMA multiplier to 32 bits allows a common element to be used for both floating point and integer operations, but suffers from the same drawbacks: real estate usage and overhead power are increased even if the unused bit portions are gated off during floating point operations.
As the foregoing illustrates, what is needed in the art is a more effective approach to performing integer and floating point arithmetic operations.