The present invention relates generally to digital computer processing, and more particularly to a circuit and method for accelerating the computer implementation of the Sweeney-Robertson-Tocher ("SRT") algorithm for carrying out floating-point division and square root operations.
Advances in computer processing have occurred primarily as a result of improved logic design and more efficient computational processes. While nearly all computer-implemented algorithms are based on a hierarchy of relatively simple operations, not all are performed by a computer at the same speed. What distinguishes one algorithm from another in terms of speed is the way in which the algorithm is implemented in available hardware. The speed at which a computer can perform an algorithm depends in large part on the efficiency of the digital logic on which the algorithm is implemented and on the way the computation is structured.
Division is arguably the most complex of the arithmetic operations to implement because it typically involves speculating or guessing the digits of the quotient. For example, M. Ercegovac, in Digital Systems and Hardware/Firmware Algorithms (1985), describes a division algorithm for two positive integers, designated dividend Y and divisor X, that produces a quotient Q and an integer remainder Z satisfying Y = XQ + Z. The division process is defined by the following recurrence relationship, in which r is the radix and Q_{n-1-j} is the quotient digit selected at step j:
$$z^{(0)} = Y$$
$$z^{(j+1)} = r\,z^{(j)} - X\,r^{n}\,Q_{n-1-j} \quad \text{for } j = 0, \ldots, n-1.$$
This relationship yields
$$z^{(n)} = r^{n}\,(Y - XQ)$$
$$Y = XQ + z^{(n)}\,r^{-n}$$
The dividend Y contains 2n digits and the divisor X has n digits to produce a quotient Q with n digits and a remainder $Z = z^{(n)} r^{-n}$. A quotient digit is selected such that $0 \le Z \le X$ at each step (j, j+1, etc.) in the division process. The quotient digit selection process is a crucial part of the division algorithm.
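The recurrence above can be illustrated with a minimal Python sketch. It is offered only as an illustration under stated assumptions, not as the method of the cited reference or of the present invention. The function name `recurrence_divide`, its parameter names, and the use of exact integer division to select each quotient digit are all assumptions made for this example. Hardware dividers typically estimate the digit rather than compute it exactly. The sketch also assumes Y < X r^n, so that the quotient fits in n digits.

```python
def recurrence_divide(Y, X, r=10, n=2):
    """Divide Y by X using z(j+1) = r*z(j) - X*r^n*Q_(n-1-j).

    Hypothetical helper for illustration only. Each quotient digit is
    chosen by exact integer division, which is an assumption of this
    sketch and not how a hardware divider selects digits.
    """
    if X <= 0 or Y < 0:
        raise ValueError("expects Y >= 0 and X > 0")
    scaled_divisor = X * r**n           # the X * r^n term of the recurrence
    if Y >= scaled_divisor:
        raise ValueError("quotient does not fit in n digits")

    z = Y                               # z(0) = Y
    digits = []                         # Q_(n-1), Q_(n-2), ..., Q_0
    for _ in range(n):
        shifted = r * z                 # r * z(j)
        q = shifted // scaled_divisor   # digit Q_(n-1-j), always in 0..r-1
        z = shifted - scaled_divisor * q  # z(j+1), stays in [0, X*r^n)
        digits.append(q)

    Q = 0
    for q in digits:                    # assemble Q, most significant digit first
        Q = Q * r + q

    Z = z // r**n                       # Z = z(n) * r^(-n), an exact division
    return Q, Z


if __name__ == "__main__":
    Q, Z = recurrence_divide(1234, 56, r=10, n=2)
    print(Q, Z)                         # satisfies 1234 == 56 * Q + Z
```

Because the digit is chosen exactly, every partial remainder stays non-negative and below X r^n, so the final remainder satisfies 0 ≤ Z < X, and no correction step is needed. A divider that only estimates each digit must tolerate or correct a wrong guess, which is why digit selection is the critical part of the algorithm.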
One well-known division algorithm is the so-called "SRT" algorithm developed by Sweeney, Robertson, and Tocher. The SRT algorithm can be used by a computer to calculate the quotient of two integers or floating-point numbers as well as the square root of a single integer or floating-point number. While the SRT division algorithm can be performed either by hardware or software, hardware implementation is generally preferred due to faster speeds and lower costs.
The SRT algorithm can be accelerated by data processing techniques such as pipelining and parallel processing. Pipelining increases the level of concurrency (i.e., the number of activities performed simultaneously) in an algorithm by breaking a large task into multiple smaller tasks carried out concurrently by sequential stages of a single hardware unit. Parallel processing, on the other hand, increases the level of concurrency by performing larger tasks simultaneously in separate hardware units.