Generally, a compiler is a computer program that translates programs expressed in a high-order language to their machine language equivalents. In the language conversion process, a Signed Integer Divide (SID) may be performed. By its nature, the result of a SID may have to be rounded.
Referring to FIG. 1, a typical compiler is shown in relation to a computer system. First, a source program 100 is input into the compiler 102, where it is converted to machine executable code 104 for use in a computer's hardware system 106. Typically associated with the compiler 102 are a set of registers 108 for transferring numeric values in and out, a code generator 110 for generating compiler arithmetic code 112 that includes divide code 114, which further includes code for dividing by a constant power of two 116, and source code execution circuitry 118. Together, these features allow the compiler 102 to reduce the source program 100 to machine executable code 104 for use in the hardware system 106.
When integers are divided, the quotient is rounded to a whole number, and the rounding is toward zero; for example, 7/2 yields 3 and -7/2 yields -3. Thus, for a positive number, a small offset is generally subtracted from the number. In the case of a negative number, a small offset is generally added in order to round the number. This rounding can be performed in many ways.
One way to perform SIDs is to use branch logic along with a "true" divide operation. For example, if the input is positive, a first sequence of instructions is executed; otherwise, if the input is negative, a second sequence of instructions is executed. Executing a true floating point or integer divide, however, can take a number of clock cycles to complete. This can slow down a processor or computer system greatly.
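By way of illustration only, the branch-based approach might be sketched as follows. Python is used here purely for exposition, and the function name `divide_branching` is hypothetical. Because Python's `//` operator rounds toward negative infinity, the negative branch divides the magnitude and then restores the sign so that the result rounds toward zero.

```python
def divide_branching(dividend: int, n: int) -> int:
    """Divide by 2**n, rounding toward zero, using a branch and a true divide.

    Illustrative sketch only; the name and signature are assumptions.
    """
    divisor = 1 << n
    if dividend >= 0:
        # First sequence: for a non-negative input, floor division
        # already rounds toward zero.
        return dividend // divisor
    # Second sequence: for a negative input, divide the magnitude and
    # then restore the sign, which rounds toward zero.
    return -((-dividend) // divisor)
```

For instance, `divide_branching(-7, 1)` gives -3, whereas a floor division of -7 by 2 would give -4.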
The key, however, is that the divisor is always a power of two, i.e., 2.sup.N. Therefore, it is better to perform a logical or arithmetic shift of a register rather than performing a true divide. Integer shifts are single-clock instructions on all conventional machines and, therefore, offer a great advantage over true divides.
Trouble occurs, however, when a signed integer is divided. In the situation of a negative signed number, simply shifting the integer number will not produce the correct result. A negative number will need to be modified in order to take advantage of shifting to divide the number by a power of two.
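The difficulty can be seen in a minimal sketch, given here in Python for exposition only. The names `shift_only` and `truncating_reference` are hypothetical. Python's `>>` on integers behaves as an arithmetic shift, which rounds toward negative infinity; the reference result uses floating-point division followed by truncation, which is adequate only for the small values shown.

```python
def shift_only(dividend: int, n: int) -> int:
    # Arithmetic right shift alone: rounds toward negative infinity.
    return dividend >> n


def truncating_reference(dividend: int, n: int) -> int:
    # Reference quotient rounded toward zero (small values only, since
    # it goes through floating point).
    return int(dividend / (1 << n))


# Dividends in [-8, 8) for which shifting by one bit disagrees with a
# divide by two that rounds toward zero.
mismatches = [x for x in range(-8, 8)
              if shift_only(x, 1) != truncating_reference(x, 1)]
print(mismatches)  # [-7, -5, -3, -1]
```

Only negative dividends that are not exact multiples of the divisor are affected; positive dividends and exact multiples shift correctly.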
One method to address the problems associated with signed numbers is to precondition the inputs. Below is a conventional "optimized" code sequence that divides a number (R1) by a power of two (2.sup.N) and places the result in a register (R2). "S" is the size in bits of the registers involved. All four instructions are serialized in that each depends on the prior instruction's result. "N" is a compile-time known constant. Four clock cycles are required for this sequence. The code sequence is as follows:
______________________________________
1) shift-right-arithmetic  T1=R1, N-1  ;; produces N copies of the sign bit
2) shift-right-log         T2=T1, S-N  ;; moves the N copies of the sign bit to the least-significant bits
3) add                     T3=R1, T2   ;; adds fudge factor to original input
4) shift-right-arithmetic  R2=T3, N    ;; shifts fudged value, giving result
______________________________________
The sequence, although optimized, requires at least four clock cycles to complete since each instruction depends on the prior instruction's result.
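The four-instruction listing above can be modeled in software as a sketch, assuming two's-complement registers of width S = 32 bits. Python is used for exposition only; the helper names (`to_signed`, `shift_right_arithmetic`, `shift_right_logical`, `divide_by_power_of_two`) and the choice of S are assumptions and do not appear in the listing. Registers are modeled as unsigned S-bit patterns so that the arithmetic and logical shifts can be distinguished.

```python
S = 32                 # assumed register width in bits
MASK = (1 << S) - 1    # keeps values within an S-bit register


def to_signed(value: int) -> int:
    """Interpret an S-bit pattern as a two's-complement integer."""
    value &= MASK
    return value - (1 << S) if value >> (S - 1) else value


def shift_right_arithmetic(reg: int, count: int) -> int:
    """Shift right, replicating the sign bit; returns an S-bit pattern."""
    return (to_signed(reg) >> count) & MASK


def shift_right_logical(reg: int, count: int) -> int:
    """Shift right, filling with zeros; returns an S-bit pattern."""
    return (reg & MASK) >> count


def divide_by_power_of_two(r1: int, n: int) -> int:
    """Compute r1 / 2**n rounded toward zero, for 1 <= n < S."""
    r1 &= MASK
    t1 = shift_right_arithmetic(r1, n - 1)  # 1) N sign copies in the high bits
    t2 = shift_right_logical(t1, S - n)     # 2) move them to the N low bits
    t3 = (r1 + t2) & MASK                   # 3) add fudge factor to the input
    r2 = shift_right_arithmetic(t3, n)      # 4) shift the fudged value
    return to_signed(r2)
```

For a dividend of -7 and N = 1, step 2 yields 1, step 3 yields -6, and step 4 yields -3. Each step consumes the previous step's result, which is the serial dependency noted above.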
Now referring to FIG. 2, the optimized sequence for dividing a signed integer by a power of two (R1/2.sup.N) 200 is shown in a flow diagram. The basic concept is to generate a value based on "N" to be added to the original dividend so that performing the arithmetic-right-shift by N produces the correct result when the dividend is a negative number. This value, or dividing factor, is (2.sup.N -1) if the dividend is a negative integer. The dividing factor is zero if the dividend is zero or a positive integer.
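The effect of the dividing factor can be checked with a short sketch at the level of ordinary integers, ignoring register width. Python is used for exposition only, and the names `rounding_bias` and `biased_shift_divide` are hypothetical. The conditional expression is used only for readability; it states the value being added, not how a branch-free sequence would obtain it.

```python
def rounding_bias(dividend: int, n: int) -> int:
    """Value added before the shift: 2**n - 1 if negative, else 0."""
    return (1 << n) - 1 if dividend < 0 else 0


def biased_shift_divide(dividend: int, n: int) -> int:
    """Add the bias, then arithmetic-right-shift by n."""
    return (dividend + rounding_bias(dividend, n)) >> n


# Worked example: -7 / 4 should round toward zero to -1.
# bias = 3, so (-7 + 3) >> 2 == -4 >> 2 == -1.
print(biased_shift_divide(-7, 2))  # -1
```

Without the bias, `-7 >> 2` would give -2; adding (2.sup.N -1) moves every negative dividend that is not an exact multiple of 2.sup.N up past the next multiple, so the shift then rounds toward zero.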
Still referring to FIG. 2, the first step 202 in the prior art sequence is to perform an arithmetic-right-shift of the original dividend by (N-1) into a temporary register. This produces N copies of the sign bit in the N high-order bits of the temporary register. In the next step 204, the value from step 202 is logically right-shifted by (S-N), where S is the number of bits in the temporary register. Step 204 produces N copies of the sign bit in the N low-order bits of the temporary register, that is, (2.sup.N -1) for a negative dividend and zero otherwise. In the next step 206, this value is added to the original dividend. In the final step 208, the result from step 206 is arithmetically right-shifted by N to give the correct result in the event of a negative dividend.
This sequence has the advantage over branch logic in that it does not incur the risk of mispredicting a branch when choosing between the sequence pertaining to the negative dividend and the sequence pertaining to the positive dividend. Branches present a risk that, in the event a branch prediction is incorrect, time may be lost in recovering from the incorrectly predicted branch. However, the sequence of FIG. 2 is still limited by the clock cycles associated with the serial execution of each instruction (e.g., four cycles in the above example).