1. Field of the Invention
The present invention relates to electronic digital information processing systems, and more particularly relates to a microarchitecture for integrated circuit logic elements implementing arithmetic functions.
2. Art Background
A common method of improving the speed of a computer system is to employ a math processor, separate from the main processor, for performing floating point mathematical calculations. The combination of a main processor and a math processor provides greatly increased speed of system operation, since math processors are optimized for performing floating point mathematical calculations, and since the burden of performing such calculations is lifted from the main processor.
When designing integrated circuit hardware for implementing digital information processing operations, circuit designers generally seek to minimize the layout area required to implement a particular digital function, while delivering the desired result as quickly as possible. Accordingly, circuit size and operational speed are of paramount concern in any digital circuit design. The foregoing is all the more crucial when designing hardware implementing arithmetic functions, principally because most mathematical functions require repetitive or iterative execution of operations to reach a desired result. In addition, floating point numbers have more bits than integer numbers and comprise several bit fields. Thus, similar operations on integer and floating point numbers are more time consuming for the floating point numbers.
Two commonly encountered hardware components in digital arithmetic circuit arrangements are regular carry-propagate adders (CPAs) and carry-save adders (CSAs). CPAs are designed to receive two inputs for the datavalues to be added. The CPA further has one output, commonly denominated "sum". The CPA operates according to well known principles wherein addend bits of the same order are added together, and a carry bit is transferred to the next higher order bit when required. The final sum is directly derived from a bit-by-bit addition with the appropriate carry to the next higher order bit, with a single bit carry out from the highest order bit position. The ripple carry of the CPA results in slow, non-parallel operation, since higher order bits are dependent on lower order bit results.
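The bit-by-bit behavior described above can be illustrated with a minimal software sketch of a ripple-carry adder. The function name `ripple_carry_add` and the `width` and `carry_in` parameters are illustrative assumptions, not taken from any particular hardware design.

```python
def ripple_carry_add(a: int, b: int, width: int = 8, carry_in: int = 0) -> tuple:
    """Model a carry-propagate adder; return (sum, carry_out)."""
    total = 0
    carry = carry_in
    for i in range(width):
        x = (a >> i) & 1
        y = (b >> i) & 1
        # Sum bit of this order depends on the carry from the lower order.
        s = x ^ y ^ carry
        # Carry to the next higher order bit (majority of x, y, carry).
        carry = (x & y) | (x & carry) | (y & carry)
        total |= s << i
    return total, carry
```

Each loop iteration needs the carry produced by the previous one, which mirrors the serial dependency that makes the ripple carry slow.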
CSAs, on the other hand, have three inputs designed to receive three numbers to be added, and have two outputs, "sum" and "carry". In CSAs, carry bits are accumulated separately from the sum bits of any given order (position), the output of the CSA being two vectors, namely sum and carry, which when added together yield the final result. The benefit of a CSA is that higher order bits have no dependency on any lower order bit because all bit positions are calculated independently, thereby avoiding the propagation latency of carry bits incurred in regular adders. This enables addition of three numbers using only one time-consuming CPA. Without a CSA, two CPAs would be required. Because of their speed and simplicity, CSAs are pervasively found in digital logic designs, although other adder designs are feasible and are implemented when necessary to provide a desired function. However, such functionality may be achieved at the expense of a larger circuit layout area, slower circuit operational speed, and reduced margin in producing the output result.
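As a hedged illustration of the 3:2 CSA described above, the following sketch computes the sum and carry vectors with bitwise operations only. The name `csa_3to2` and the `width` parameter are assumptions made for this example.

```python
def csa_3to2(a: int, b: int, c: int, width: int = 8) -> tuple:
    """Model a 3:2 carry-save adder; return (sum_vector, carry_vector)."""
    mask = (1 << width) - 1
    # Each sum bit is the XOR of the three input bits of the same order.
    sum_vec = (a ^ b ^ c) & mask
    # Each carry bit is the majority of the three input bits, weighted one
    # order higher; the least significant carry bit is therefore always 0.
    carry_vec = (((a & b) | (a & c) | (b & c)) << 1) & mask
    return sum_vec, carry_vec
```

No bit position reads a result from another position, so all positions can be evaluated in parallel. A single carry-propagate addition of the two vectors then yields the final result modulo 2^width.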
In particular, it may be occasionally desirable to add more than two numbers in the same clock cycle. Alternatively, it may be desirable to add two numbers and also subtract a third number in the same clock cycle. Although the addition of three numbers can be accommodated by a prior art standard design of 3:2 CSA, subtraction of one number in combination with addition of two other numbers poses a more difficult problem. Principally, adders are commonly invoked, whereas subtraction circuits are rarely designed. Instead, the most common solution to implement subtraction is to invoke addition of a 2's complement datavalue, which may be accomplished in an adder circuit arrangement. As is generally known, a 2's complement representation of any binary value may be derived by inverting a given number to its 1's complement equivalent, and thereafter adding one. The 2's complement number may then be added to another number, thereby invoking the subtraction operation within an adder hardware implementation.
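The 2's complement approach to subtraction can be sketched as follows, assuming fixed-width unsigned values. The names `twos_complement` and `subtract_via_add` are hypothetical and chosen only for illustration.

```python
def twos_complement(x: int, width: int = 8) -> int:
    """Invert x to its 1's complement, then add one (modulo 2**width)."""
    mask = (1 << width) - 1
    ones_complement = ~x & mask
    return (ones_complement + 1) & mask


def subtract_via_add(a: int, b: int, width: int = 8) -> int:
    """Compute a - b using only addition of the 2's complement of b."""
    mask = (1 << width) - 1
    return (a + twos_complement(b, width)) & mask
```

The result is taken modulo 2^width, so a negative difference appears in its own 2's complement form.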
Hardware implementations to achieve subtraction in combination with the addition of more than two datavalues could be produced by extending the size and complexity of the adder circuitry. For example, one prior implementation employs a 3:2 CSA to add the three datavalues, and a carry-propagate adder (CPA) coupled to combine the sum and carry outputs of the 3:2 CSA. A "carry in" input to the CPA completes the 2's complement addition if one of the 3:2 CSA inputs is inverted. However, the associated reduction in operational speed and increase in size due to such extra circuitry would likely pose serious performance handicaps in high-performance, high-frequency designs, especially when the sum need only be maintained in CSA form (as a sum and a carry vector).
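The prior implementation described above might be modeled as in the sketch below, which computes a + b - c by inverting one CSA input and supplying a carry-in of 1 to the following CPA. The function name and the use of Python's `+` operator to stand in for the CPA are assumptions of this example.

```python
def add_add_sub_with_cpa(a: int, b: int, c: int, width: int = 8) -> int:
    """Compute (a + b - c) mod 2**width with a 3:2 CSA followed by a CPA."""
    mask = (1 << width) - 1
    not_c = ~c & mask  # inverted third input (1's complement of c)
    # 3:2 CSA stage.
    sum_vec = (a ^ b ^ not_c) & mask
    carry_vec = (((a & b) | (a & not_c) | (b & not_c)) << 1) & mask
    # CPA stage: the carry-in of 1 completes the 2's complement of c.
    carry_in = 1
    return (sum_vec + carry_vec + carry_in) & mask
```

The result emerges as a single fully propagated value, which is more than is needed when the sum could remain in CSA form.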
Accordingly, and as will be explained in more detail in the following paragraphs, subtraction operations can be readily implemented in the particular case of the 3:2 CSA by postponing the addition of the constant "1" in the case of 2's complement addition until after all bits have been added in the 3:2 CSA. By taking advantage of the least significant bit position in the carry output vector, a carry-in operation can be accomplished such that three datavalues may be presented to the input of the CSA without requiring one of the data inputs to be reserved to receive the constant "1" and without employing a subsequent CPA.
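One way to picture the postponed addition of the constant "1" is the sketch below. It assumes that the otherwise unused least significant bit of the carry vector is simply set to 1; the function name and the exact mechanism are illustrative assumptions, not a statement of the circuit detailed later.

```python
def csa_add_add_sub(a: int, b: int, c: int, width: int = 8) -> tuple:
    """Return (sum_vector, carry_vector) representing (a + b - c) mod 2**width."""
    mask = (1 << width) - 1
    not_c = ~c & mask  # inverted third input (1's complement of c)
    sum_vec = (a ^ b ^ not_c) & mask
    carry_vec = (((a & b) | (a & not_c) | (b & not_c)) << 1) & mask
    # The shifted carry vector always has a 0 in its least significant bit,
    # so the postponed "+1" can be placed there without any extra adder.
    carry_vec |= 1
    return sum_vec, carry_vec
```

The result stays in carry-save form: the two vectors together represent a + b - c, and no carry-propagate addition is performed inside the function.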