A. Field of the Invention
This invention relates to an arithmetic logic unit, which is a device utilized in the arithmetic section of contemporary data processing systems. These devices are used to implement the basic functions found in most computer repertoires, such as add, subtract, AND, OR, or Exclusive OR.
B. Prior Art
Originally, in carrying out the four basic arithmetic processes--addition, subtraction, multiplication and division--people made extensive use of tables of sums, differences and products that they had memorized. Reference to such tables, however, seriously slows the progress of a computation, and this is equally true when the computation is carried out by a computer. A few older computers did, in fact, use tables. Wired circuits that automatically provide the solution as output, given the operands as input, are, of course, much faster.
In the case of addition, since the computer represents numbers to the base 2, it need only provide for three possible combinations of single-digit operands: 0 and 0, 0 and 1, and 1 and 1.
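The three operand combinations can be enumerated directly. The following is a minimal illustrative sketch, not circuitry taken from this specification; the function name `half_add` and the variable `table` are hypothetical names chosen for the example.

```python
from itertools import combinations_with_replacement

def half_add(a, b):
    """Return (sum_bit, carry_bit) for two one-bit operands."""
    # Sum is the Exclusive OR of the operands; carry is their AND.
    return a ^ b, a & b

# Unordered operand pairs: (0, 0), (0, 1), (1, 1).
table = {
    pair: half_add(*pair)
    for pair in combinations_with_replacement((0, 1), 2)
}

for (a, b), (s, c) in table.items():
    print(f"{a} + {b} -> sum {s}, carry {c}")
```

Because addition is commutative, the pair 1 and 0 gives the same result as 0 and 1 and needs no separate entry.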
Usually, arithmetic processes such as subtraction are carried out by means of complements. For example, subtraction may be defined as the addition of two quantities after reversing the sign of one of them. In most computers, subtraction is carried out by adding the complement of the number to be subtracted. The complement of a number is obtained by subtracting it from the next highest power of the base. This is called 10's complementing if the base is 10. The complement can also be obtained by subtracting the number from the next highest power of the base less 1. This is called 9's complementing if the base is 10. If the base is 2, these complements are called the 2's complement and the 1's complement, respectively.
Thus, the 2's complement of $101_2$ is $1000_2 - 101_2 = 011_2$, and the 1's complement of $101_2$ is $111_2 - 101_2 = 010_2$. Note the practice of carrying the same number of digits in the complement as appeared in the original number. Also, since the only possible digits of a binary number are 0 and 1, and since 1-1=0 and 1-0=1, the 1's complement of a binary number is obtained by merely inverting each digit.
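The complement arithmetic described above can be checked with a short sketch. This is an illustration under stated assumptions only: the function names and the explicit `width` parameter (the number of digits carried) are hypothetical and do not come from this specification.

```python
def ones_complement(x, width):
    """Subtract x from (2**width - 1), i.e. invert each of the width digits."""
    all_ones = (1 << width) - 1
    return all_ones - x

def twos_complement(x, width):
    """Subtract x from 2**width, keeping only width digits."""
    all_ones = (1 << width) - 1
    return ((1 << width) - x) & all_ones

def subtract(a, b, width):
    """Compute a - b by adding the 2's complement of b and dropping the carry out."""
    all_ones = (1 << width) - 1
    return (a + twos_complement(b, width)) & all_ones

print(format(ones_complement(0b101, 3), "03b"))  # 1's complement of 101
print(format(twos_complement(0b101, 3), "03b"))  # 2's complement of 101
print(format(subtract(0b110, 0b101, 3), "03b"))  # 110 - 101 via complement addition
```

Dropping the carry out of the top digit is what makes the sum of `a` and the 2's complement of `b` equal to the difference, within the fixed number of digits carried.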
Every attempt has been made, in past machines, to speed up the execution of arithmetical operations by multiplexing techniques. In at least one known machine, the arithmetic unit consisted of a parallel unit and a serial unit. The parallel unit performed floating point operations and the serial unit performed variable field length operations. The two units used the same two arithmetic registers, namely, a double length accumulator register consisting of two 64 bit registers (A&B), and a double length operand register consisting of two 64 bit registers (C&D). The common use of these registers enabled the machine to shift between floating point and variable field length operations at any time.
Thus, the result obtained from a floating point operation could serve as a starting operand for a variable field length operation and vice versa. However, in spite of improvements, earlier arithmetic units were limited in size and capability in that they had fixed designs which could only accomplish arithmetic functions.
The arithmetic section of contemporary computers utilizes devices commonly referred to as Arithmetic Logic Units (ALUs) to implement the basic functions found in most present day computer repertoires, such as add, subtract, AND, OR, and Exclusive OR. The devices are used as basic building blocks which are linked together to form adders of the desired width. Intergroup borrows or carries from the individual ALU devices, which are commonly available in 4-8 bit groups, are grouped and distributed back in a section referred to as the carry or borrow lookahead logic. While these devices perform most of the basic arithmetic and logical functions required, they do not allow for masked operations commonly found in many contemporary computers. These operations, such as masked compare and field substitute, require either additional hardware between the source data and the ALU or multiple passes through the ALU, using the basic primitive functions provided to implement the desired function. Additional hardware in the prime data path is undesirable since all operations are slowed down due to the additional delay imposed by this hardware. Multiple passes through the ALU also slow down these instructions because of the additional sequencing required.
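The cost of the multiple-pass approach can be illustrated with a minimal sketch. Everything here is an assumption made for the example: the `alu` function models a device offering only primitive functions, and the definitions of masked compare (equality within the masked bits) and field substitute (replace the masked bits of one word with those of another) are plausible readings, not definitions given in this specification.

```python
WIDTH = 8
ALL_ONES = (1 << WIDTH) - 1

def alu(op, a, b):
    """Hypothetical primitive ALU: one call models one pass through the device."""
    results = {
        "ADD": a + b,
        "SUB": a - b,
        "AND": a & b,
        "OR": a | b,
        "XOR": a ^ b,
    }
    return results[op] & ALL_ONES

def masked_compare(a, b, field_mask):
    """Assumed semantics: True when a and b agree in every masked bit."""
    diff = alu("XOR", a, b)                 # pass 1: mark differing bits
    masked = alu("AND", diff, field_mask)   # pass 2: keep only the field
    return masked == 0

def field_substitute(dest, src, field_mask):
    """Assumed semantics: masked bits come from src, the rest from dest."""
    inverse = alu("XOR", field_mask, ALL_ONES)  # pass 1: invert the mask
    keep = alu("AND", dest, inverse)            # pass 2: bits kept from dest
    take = alu("AND", src, field_mask)          # pass 3: bits taken from src
    return alu("OR", keep, take)                # pass 4: merge

print(masked_compare(0b10110101, 0b10111010, 0b11110000))
print(format(field_substitute(0b10100101, 0b00001111, 0b00111100), "08b"))
```

In this sketch a masked compare takes two passes and a field substitute takes four, each of which would need its own sequencing step in a machine built from primitive-only devices.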
A further weakness of these devices is that their external borrow or carry structures are not optimized from a system implementation point of view. In modern high-speed computers, the interconnection of these devices accounts for a large percentage of the total delay of the full adder. To optimize the adder design, it is important that the design place the greatest emphasis on the generation of group borrow or carry signals and group propagate signals. It is equally important to minimize the delay from the borrow or carry input of a group to the sum or difference output of the device. Most available devices perform respectably in borrow or carry generation but do not optimize the path from borrow or carry in to data out.
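The two paths discussed above, group generate/propagate out of each device and carry in to sum out, can be seen in a small behavioral sketch of a 16-bit adder built from four 4-bit groups. This is a generic carry lookahead model under stated assumptions, not the circuit of this invention; the names `slice_gp`, `slice_sum` and `add16` are hypothetical.

```python
SLICE_BITS = 4
SLICE_MASK = (1 << SLICE_BITS) - 1

def slice_gp(a, b):
    """Group generate and group propagate for one 4-bit group.

    Neither signal depends on the carry in, so every group can
    produce them in parallel as soon as the operands arrive.
    """
    generate, propagate = 0, 1
    for i in range(SLICE_BITS):
        ai, bi = (a >> i) & 1, (b >> i) & 1
        gi, pi = ai & bi, ai | bi
        generate = gi | (pi & generate)
        propagate &= pi
    return generate, propagate

def slice_sum(a, b, carry_in):
    """Carry in to sum out path of one 4-bit group."""
    return (a + b + carry_in) & SLICE_MASK

def add16(a, b, carry_in=0):
    """Add two 16-bit operands using four groups and lookahead carries."""
    groups = [
        ((a >> (SLICE_BITS * k)) & SLICE_MASK, (b >> (SLICE_BITS * k)) & SLICE_MASK)
        for k in range(4)
    ]
    signals = [slice_gp(x, y) for x, y in groups]

    # Lookahead logic: the carry into each group from G, P and the carry in.
    # Hardware flattens this recurrence into two-level logic; the loop only
    # models its result.
    carries = [carry_in]
    for generate, propagate in signals:
        carries.append(generate | (propagate & carries[-1]))

    total = 0
    for k, (x, y) in enumerate(groups):
        total |= slice_sum(x, y, carries[k]) << (SLICE_BITS * k)
    return total, carries[4]

print(add16(0x1234, 0x0FCC))
print(add16(0xFFFF, 0x0001))
```

The overall delay in this model is the time to form the group signals, plus the lookahead logic, plus `slice_sum`; a device that is fast at the first step but slow at the last still limits the full adder.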
Another weakness of these devices is the added delay imposed by the mode control lines which select the function the ALU is to perform. In a system which attempts to use the ALU at full capacity by cycling new data and new ALU modes on consecutive machine cycles, both data and control information arrive at the ALU at the same time. The ALU delay is thereby determined by the slowest overall path, which normally is the control path.