The present invention covers a bus control apparatus arrangement forming part of a floating-point unit within a microprocessor. The microprocessor utilized with the present invention is the Intel 860™ microprocessor, frequently referred to as the N10™ processor. (Intel is a registered trademark of Intel Corporation.)
The N10 processor combines a 32/64-bit IEEE-compatible floating-point processor, a 32-bit RISC integer processor and a 64-bit 3-dimensional graphics processor. Using a numerics processor optimized for both vector and scalar operations, it represents the industry's first integrated high-performance vector processor, incorporating over one million transistors and providing about half the performance of the Cray-1, all on a single chip. The N10 processor uses pipelined floating-point units to achieve extremely fast execution rates.
As will be seen, the present invention provides a highly optimized bus control apparatus for the floating-point hardware of the N10 processor. This bus control apparatus supports simultaneous (dual) operation of a multiplier and an adder unit. These dual-operations support the most commonly used software algorithms such as sum of products, DAXPY, FFT, etc.
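As an illustration of why these algorithms benefit from pairing a multiplier with an adder, the following minimal Python sketch writes out two of them, DAXPY and a sum of products, so that the one-multiply-plus-one-add pattern per element is visible. The function names and the use of Python lists are assumptions made for illustration only; nothing here describes how the N10 hardware is programmed.

```python
def daxpy(a, x, y):
    """Return a*x + y element by element (the DAXPY pattern).

    Each output element needs exactly one multiply (a * xi)
    and one add (+ yi), which is the pairing a dual operation targets.
    """
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    return [a * xi + yi for xi, yi in zip(x, y)]


def sum_of_products(x, y):
    """Return x[0]*y[0] + x[1]*y[1] + ... as a running total.

    Each step is again one multiply followed by one add,
    with the add folding the product into the running total.
    """
    total = 0.0
    for xi, yi in zip(x, y):
        total += xi * yi
    return total
```

In both routines the inner step is a multiply whose result feeds an add; the routines differ only in where the adder's second operand comes from (a fresh element of y versus the running total).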
Normally, the instructions of a microprocessor specify their source operands and destinations from a set of floating-point registers. In most systems, this register set supplies two source operands and one destination operand. The three-operand arrangement is sufficient for simple add or multiply operations. However, to perform a dual operation, such as an add and a multiply simultaneously, three more operands (for a total of six) need to be supplied. Because it is very inefficient to require the floating-point register file to deal with six operands, prior art microprocessors typically perform dual operations serially. In other words, an add is first performed followed by a multiply, or vice versa.
An alternative to serial operation is to perform the multiply and add operations in parallel. One widely adopted approach is known as a multiply accumulate operation. In the multiply accumulate operation, the multiplier receives two source operands from the floating-point register file. One operand input of the adder receives the result output of the multiplier. The other operand input of the adder is coupled to the result output of the adder itself, in a feedback arrangement. The arithmetic operation performed is essentially an accumulation of a sum of products. The chief drawback of the multiply accumulate operation is that it can implement only one simple kind of operation, i.e., a sum of products. This is because the interconnects are generally "hard-wired" in a fixed arrangement. Because it is desirable to implement a variety of algorithms, what is needed is an apparatus which is substantially more generalized and can handle a broader range of operations. It would be advantageous to have a bus control apparatus that could implement complex algorithms in a much more efficient manner. As will be seen, the present invention permits a broad range of parallel operations, or algorithms, to be executed in an efficient manner. This capability enhances the presently described microprocessor when compared to prior art processors.
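The fixed multiply-then-add feedback arrangement described above (commonly called multiply-accumulate) can be modeled in a few lines of Python. This is only a behavioral sketch under stated assumptions: the class name, the `step` method and the zero initial value of the adder result are hypothetical and are not taken from any prior art design.

```python
class FixedMultiplyAccumulate:
    """Behavioral model of a hard-wired multiply/add feedback datapath.

    Assumed wiring, following the prose description:
      - the multiplier takes both source operands from the register file;
      - adder input 1 is the multiplier result;
      - adder input 2 is the adder's own previous result (feedback).
    """

    def __init__(self):
        # Assumption: the adder result starts at zero.
        self.adder_result = 0.0

    def step(self, src1, src2):
        product = src1 * src2                             # multiplier
        self.adder_result = product + self.adder_result   # adder with feedback
        return self.adder_result


def accumulate(pairs):
    """Feed (src1, src2) pairs through the datapath; return the final sum."""
    mac = FixedMultiplyAccumulate()
    for src1, src2 in pairs:
        mac.step(src1, src2)
    return mac.adder_result
```

Because the adder's second input is tied to its own output in this model, the only quantity it can produce is a running sum of products. An operation whose adder operand must come from the register file, such as the `a*x[i] + y[i]` step of DAXPY, cannot be expressed without changing the wiring, which illustrates the limitation noted above.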