The present invention relates to timing circuits for multiplier and divide/square root units in floating point systems.
A typical floating point system will have both a multiplier (MUL) and an arithmetic logic unit (ALU). Multiplication requires the summation of many partial products, which makes the time to complete a MUL operation longer than the time for an ALU operation. Some systems require the clock cycle to be as short as possible, which results in multiplication taking more cycles than addition. Other systems require a simple programming model with the same number of cycles for both multiplication and ALU operations. This results in more clock cycles being allocated for an ALU operation than are necessary.
In one architecture, a divide/square root unit shares the same input stage as the multiplier unit and provides its result to succeeding stages of the multiplier unit. Such a divide/square root unit typically requires a fixed number of cycles for an operation, so that control circuitry clocks in the operands and then clocks out the result after counting the required number of cycles.
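The fixed-cycle control described above can be illustrated with a minimal behavioral sketch. It is not taken from any particular design: the class name `FixedLatencyController`, its methods, and the latency of 4 cycles used in the example are all hypothetical, chosen only to show operands being clocked in and the result being clocked out after a counted number of cycles.

```python
class FixedLatencyController:
    """Behavioral model of control circuitry that counts a fixed number of cycles.

    The latency value is an assumption for illustration; the text states only
    that the divide/square root unit needs a fixed number of cycles.
    """

    def __init__(self, latency_cycles):
        self.latency_cycles = latency_cycles
        self.remaining = 0
        self.busy = False

    def start(self):
        """Clock in the operands and load the cycle counter."""
        self.remaining = self.latency_cycles
        self.busy = True

    def tick(self):
        """Advance one clock cycle; return True on the cycle the result is clocked out."""
        if not self.busy:
            return False
        self.remaining -= 1
        if self.remaining == 0:
            self.busy = False
            return True
        return False


# Hypothetical 4-cycle operation: the result appears on the fourth tick.
ctrl = FixedLatencyController(latency_cycles=4)
ctrl.start()
done_flags = [ctrl.tick() for _ in range(5)]
print(done_flags)  # [False, False, False, True, False]
```

Because the count is fixed, the controller needs no completion signal from the unit itself; it only needs to know when the operation started.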
In one system, a multiplier may have three stages: an input stage, an intermediate stage with a fixed-point half-array multiplier, and an output stage. A register at the beginning of the output stage has its output fed back to a multiplexer in the intermediate stage to allow a second pass through the half-array for double-precision floating-point and integer multiplications. Single-precision and mixed-precision multiplications require only one pass through the array.
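The two-pass use of the half-array can be sketched arithmetically as follows. This is a behavioral model under stated assumptions, not a description of the actual circuit: the function names, the choice to split the multiplier operand into low and high halves, and the 53-bit operand width in the example are all hypothetical. The sketch shows only why feeding the first-pass result back lets a smaller array produce a full-width product.

```python
def half_array_pass(multiplicand, multiplier_part, feedback):
    """One pass through a hypothetical half-array.

    Sums the partial products for the multiplier bits presented on this pass
    and adds the value fed back from the output-stage register.
    """
    return multiplicand * multiplier_part + feedback


def multiply(a, b, width, two_pass):
    """Multiply unsigned integers a and b using one or two half-array passes."""
    if not two_pass:
        # Single-pass case: the whole multiplier fits the array.
        return half_array_pass(a, b, 0)

    half = width // 2
    low = b & ((1 << half) - 1)   # multiplier bits handled on the first pass
    high = b >> half              # remaining multiplier bits, second pass

    first = half_array_pass(a, low, 0)
    # Second pass: the multiplexer selects the fed-back first-pass result.
    # Shifting the multiplicand models the weight of the high multiplier bits.
    return half_array_pass(a << half, high, first)


# Assumed 53-bit operands, as for a double-precision significand.
a = 0x1FFFFFFFFFFFFF
b = 0x1ABCDEF0123456
assert multiply(a, b, width=53, two_pass=True) == a * b
assert multiply(5, 7, width=24, two_pass=False) == 35
```

In hardware the second pass would typically realign the fed-back sum rather than shift the multiplicand; the model shifts the multiplicand only because it keeps the arithmetic easy to check.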