Minimizing instruction latency is an integral part of high-performance processor design. An operation such as integer division, which computes a quotient and a remainder, typically requires a double-digit number of clock cycles, whereas simpler integer operations execute in a low single-digit number of clock cycles. There is an ongoing need to reduce processing latency in order to enhance overall processor performance.