Target devices such as field programmable gate arrays (FPGAs), application specific integrated circuits (ASICs), and structured ASICs are used to implement large systems that may include millions of gates and megabits of embedded memory. The complexity of a large system often requires the use of electronic design automation (EDA) tools to create and optimize a design for the system on physical target devices. Among the procedures performed by EDA tools in a computer aided design (CAD) compilation flow is hardware description language (HDL) compilation. HDL compilation involves performing synthesis, placement, routing, and timing analysis of the system on the target device.
Division is a commonly used arithmetic operation. Commonly used classes of division algorithms include those that perform sequential division and those that perform fully parallel division. Sequential division requires multiple clock cycles, with each clock cycle calculating only a few bits of the quotient. Fully parallel division computes a complete quotient from a dividend and a divisor in a single clock cycle, so that a new quotient can be produced every clock cycle.
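As an illustration of the sequential case, the following is a minimal software sketch of radix-2 restoring division, in which each loop iteration stands in for one clock cycle and yields one quotient bit. The text above does not name a specific algorithm, so the choice of restoring division, the function name `sequential_divide`, and the `width` parameter are assumptions made only for this example.

```python
from typing import Tuple


def sequential_divide(dividend: int, divisor: int, width: int = 8) -> Tuple[int, int, int]:
    """Hypothetical model of sequential (restoring) division.

    Returns (quotient, remainder, cycles), where cycles counts the
    simulated clock cycles, one per quotient bit.
    """
    if divisor <= 0:
        raise ValueError("divisor must be a positive integer")
    if not 0 <= dividend < (1 << width):
        raise ValueError("dividend must fit in `width` unsigned bits")

    remainder = 0
    quotient = 0
    cycles = 0
    # Process the dividend from its most significant bit to its least.
    for bit in range(width - 1, -1, -1):
        # Shift the next dividend bit into the partial remainder.
        remainder = (remainder << 1) | ((dividend >> bit) & 1)
        quotient <<= 1
        # Trial subtraction: keep the result only if it is non-negative.
        if remainder >= divisor:
            remainder -= divisor
            quotient |= 1
        cycles += 1
    return quotient, remainder, cycles
```

For an 8-bit dividend this model takes eight iterations to produce the full quotient, whereas a fully parallel divider would deliver the same quotient in a single cycle at the cost of a much deeper combinational path.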
In order to perform fully parallel division at a reasonable clock frequency, designers are required to heavily pipeline the circuitry used for the division operation by adding many pipeline stages. The pipelining allows the circuitry to operate at a higher frequency, but increases propagation delay and the amount of logic required.