1. Field of the Invention
The present invention relates generally to computer arithmetic and more specifically to performing integer division using floating-point units.
2. Description of the Related Art
Many current computer processors do not incorporate integer division logic into the digital circuit design of their arithmetic logic units (ALUs) because integer division operations tend to be too infrequent to justify the hardware expense of such logic. As such, integer division is typically implemented in software, utilizing algorithms that leverage arithmetic operations that are available in the ALU, such as addition, subtraction, multiplication, logical operations (AND, NOT, OR, XOR), and bit-shifting. For example, a classic “shift and subtract” algorithm for integer division utilizes only addition, subtraction, comparison, and shifting operations and mimics well-known long division techniques.
FIG. 1A depicts a flowchart of a classic “shift and subtract” algorithm for integer division. The terms “N” (i.e., numerator or dividend) and “D” (i.e., denominator or divisor) are used to represent a division operation, N÷D. In step 100, as in traditional long division, D is left-aligned with N (e.g., visually, D can be left-aligned and written under N). In step 105, the digits of D are compared against the corresponding digits of N (e.g., above D) to determine the leftmost (i.e., most significant) digit of the quotient. If D is not greater than the corresponding N digits in step 110, D is subtracted from the corresponding N digits (step 115), the result of the subtraction replaces the corresponding N digits in N, creating a new N (step 117) representing the remaining dividend, and the current quotient digit is incremented (step 119). Once D is greater than the corresponding N digits above it but D is not greater than the current version of N (step 120), D is shifted to the right one digit (step 125) and the next quotient digit is determined by repeating the process of steps 105 through 119. The process ends when D is larger than the current N (i.e., the remaining dividend) in step 120.
FIG. 1B depicts a visual representation of the flowchart of FIG. 1A when executing the division operation 175÷12 in binary form. As can be seen, the loops in FIG. 1B depict the iterations through the loops created by steps 105-110-115-117-119 and steps 105-110-120-125 and mimic the traditional long division steps displayed in 130. Due to these iterations, a software implementation of integer division utilizing the shift and subtract algorithm or other similar division techniques can consume a significant amount of computing time and cycles. It should be noted that there exist other known integer division algorithms that improve on this classic “shift and subtract” technique, for example, by utilizing “oracles” to eliminate repetitive iterations such as those created by steps 105-110-115-117-119; however, even with such improved techniques, a pure software implementation still consumes significant amounts of computing time and cycles.
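By way of illustration only, the binary form of the shift and subtract technique described above can be sketched in Python as follows. The function name shift_subtract_divide and its variable names are hypothetical and do not appear in the figures. The comments map each part of the sketch to the steps of FIG. 1A on a best-effort basis, and the sketch assumes a non-negative N and a positive D.

```python
def shift_subtract_divide(n, d):
    """Return (quotient, remainder) of n / d using only shifts,
    comparisons, additions, and subtractions.

    Illustrative sketch: assumes n >= 0 and d > 0.
    """
    if n < 0 or d <= 0:
        raise ValueError("sketch assumes n >= 0 and d > 0")

    # Step 100 (assumed mapping): left-align D with N by finding the
    # largest left shift for which the shifted D still fits within N.
    shift = 0
    while (d << (shift + 1)) <= n:
        shift += 1

    quotient = 0
    remainder = n  # the "current N", i.e., the remaining dividend

    while shift >= 0:
        aligned = d << shift
        digit = 0
        # Steps 105-119: subtract the aligned D while it fits and
        # increment the current quotient digit. In binary this inner
        # loop runs at most once per digit.
        while aligned <= remainder:
            remainder -= aligned
            digit += 1
        # Append the finished digit to the quotient.
        quotient = (quotient << 1) + digit
        # Step 125: shift D one digit to the right.
        shift -= 1

    return quotient, remainder
```

For the example of FIG. 1B, shift_subtract_divide(175, 12) returns a quotient of 14 and a remainder of 7, after one pass through the outer loop for each of the four quotient bits.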
In contrast, floating point division is an operation that is typically provided in the digital circuit design of floating point units (FPUs) of processors. As such, floating point division is often significantly faster than integer division because floating point division is implemented in the hardware while integer division is implemented at the software level. For example, certain commercial processors report that integer division in software for a 32 bit integer consumes 39 cycles while floating point division in hardware for a double precision float (64 bits) consumes only 20 cycles.
Depending upon the format used for floating point numbers in a computing system, integer division can be performed by converting the integers into the floating point format and then executing a floating point division in the FPU. FIG. 1C depicts various integer and floating point formats used in a typical computing system. Format 135 represents a single precision floating point in accordance with the IEEE 754 standard (hereinafter, also referred to as a “float”). As can be seen, the format of a float is subdivided into 3 sections: 1 bit is a sign bit, 8 bits represent the exponent, and 23 bits represent the mantissa or fractional value. The mantissa contains the significant digits of the float, in this case, 23 bits of precision (24 bits of precision, implicitly, in accordance with the IEEE 754 standard). As such, integer division can be performed on integers that utilize fewer than 24 bits, such as an 8 bit unsigned integer 140, by converting or casting the integers into floats (i.e., inserting the 8 bits into the mantissa) and executing the division as a floating point division through an FPU. The float result can be converted or cast back into an integer without loss of precision. However, a 64 bit unsigned integer 145 would not fit into the 24 bits of mantissa of a float (see 150). Specifically, 40 bits of precision would be lost if a 64 bit unsigned integer 145 were converted into a 32 bit float.
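As a rough sketch of the conversion approach described above, the following Python code emulates single precision arithmetic by rounding values through the standard struct module, since Python's built-in float is double precision. The helper names to_float32 and divide_via_float32 are hypothetical, and the emulation is an assumption made for illustration rather than a description of any particular FPU.

```python
import struct


def to_float32(x):
    """Round a Python float (double precision) to the nearest
    IEEE 754 single precision value, returned as a Python float."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def divide_via_float32(n, d):
    """Divide non-negative integers by casting to float, dividing,
    and truncating the result back to an integer.

    Illustrative sketch: only trustworthy when n and d fit within
    the 24 bits of precision of a float.
    """
    if d == 0:
        raise ValueError("division by zero")
    fn = to_float32(float(n))   # cast dividend to float
    fd = to_float32(float(d))   # cast divisor to float
    fq = to_float32(fn / fd)    # emulated single precision division
    return int(fq)              # cast back, truncating the fraction


def survives_float32(n):
    """Report whether integer n is unchanged by a round trip
    through the emulated float format."""
    return int(to_float32(float(n))) == n
```

Under these assumptions, divide_via_float32 agrees with exact integer division for every pair of 8 bit unsigned operands with a nonzero divisor. By contrast, survives_float32 reports that a value such as 2**24 + 1 is already altered by the conversion, which is the loss of precision that prevents a 64 bit integer from being handled the same way.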
As the foregoing illustrates, what is needed in the art is a technique for performing higher precision (e.g., 64 bit) integer division operations with a low precision (e.g., 32 bit) hardware operation, such as a division operation in a floating point unit that only supports floating point formats (e.g., 32 bit float with 24 bits of precision, etc.) whose mantissas are significantly smaller than the bit size of the integers (e.g., 64 bits).