The present invention relates generally to computer-aided design systems and more particularly to a method of automatically designing pipelined stages by dividing a combinational circuit into parts in a computer-aided design system.
Computer-aided design ("CAD") systems have become increasingly sophisticated and have automated many aspects of the design of complex machines. One type of complex machine that can be designed with the aid of a CAD system is an electronic device such as a computer. A CAD system cannot design an entire computer, but it can be of tremendous value to a human computer designer. One way that a CAD system can assist the designer is by automatically generating a netlist for an overall circuit that the designer has created. A "netlist" is a detailed description of a combination of elementary electronic circuit elements that make up such an overall circuit. For example, a netlist may specify a logic AND gate having an output connected to a first input of a logic OR gate, and so on. A netlist may include many thousands of circuit elements and all the interconnections therebetween.
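By way of illustration only, the AND-gate and OR-gate example above might be represented in software roughly as follows. This is a minimal sketch; the `Gate` class, the `NETLIST` dictionary, the signal names `a`, `b`, `c` and the `evaluate` helper are all hypothetical and are not taken from any particular CAD system.

```python
from dataclasses import dataclass


@dataclass
class Gate:
    kind: str     # "AND" or "OR" in this sketch
    inputs: list  # names of the signals driving each input, in order


# Hypothetical two-gate netlist: the output of AND gate g1 drives
# the first input of OR gate g2.
NETLIST = {
    "g1": Gate("AND", ["a", "b"]),
    "g2": Gate("OR", ["g1", "c"]),
}


def evaluate(netlist, primary_inputs):
    """Compute the output of every gate for the given primary input values."""
    values = dict(primary_inputs)

    def value_of(signal):
        if signal not in values:
            gate = netlist[signal]
            ins = [value_of(s) for s in gate.inputs]
            values[signal] = all(ins) if gate.kind == "AND" else any(ins)
        return values[signal]

    return {name: value_of(name) for name in netlist}
```

A real netlist would carry many more element types and attributes; the sketch records only which signal drives which input, which is the essential content of the description above.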
Existing CAD systems can assist a computer system designer not only by generating netlists but also by automating certain of the tasks involved in designing some kinds of logic circuits. For example, a CAD system that can modify a design of an adder in response to a request from a designer is described in U.S. patent application Ser. No. 08/031,775, filed Mar. 15, 1993 and owned by the same assignee as the present application, the contents of which are incorporated herein by this reference.
An approach to computer architecture that is becoming of greater importance is pipelining. Pipelining may be described as a technique of breaking a sequential process into several subprocesses and executing the various subprocesses concurrently. A simple example of a portion of a computer that implements this technique is shown in FIG. 1. Data is received at an input port 11 and latched into a first latch 13 upon the occurrence of a clock pulse. Once the data is latched into the latch 13, it is provided to a first stage 15. This first stage 15 typically comprises a combinational circuit such as an adder or almost any other type of logic circuit that is desired. The output of the first stage 15 is latched into a second latch 17 on the next clock pulse and is thereupon provided to a second stage 19 which is also a combinational circuit. The logic of the second stage may or may not be similar to that of the first. The output of the second stage is in turn latched into a third latch 21 on the next clock pulse and provided to a third stage 23. The third stage provides its output at a data output 25.
From the foregoing description it will be apparent that each stage performs its task concurrently with the others, but with different inputs. The stages of a pipeline may be compared to a row of workers on an automobile assembly line, each worker performing a different task. All the workers perform their tasks concurrently, but each works on a different car at any one time. When each worker has performed his or her task on one car, all the cars are advanced to the next stage on the assembly line.
An example of a task that a pipelined computer can perform much faster than a simple sequential computer is the task of adding two floating-point numbers. A floating-point number is a number that is expressed in the form $A \times 10^{B}$, where A (the mantissa) is a decimal fraction between zero and one and B (the exponent) is an integer. The task of adding two floating-point numbers requires three steps: align the mantissas, sum the mantissas, and normalize the result. In a sequential computer, each of these steps must be performed separately. If each step takes one unit of time, the computer will need three units of time to add the two numbers. In a complicated scientific calculation there may be thousands of such additions to be performed. The time required to perform all these additions could be reduced by a factor of three if the computer could perform all three steps in a single unit of time.
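The three steps named above can be sketched in software as follows. This is an illustrative sketch only: the function names are hypothetical, a number is assumed to be held as a (mantissa, exponent) pair with a `Decimal` mantissa, and a normalized mantissa is assumed to lie in the range from 0.1 up to but not including 1.

```python
from decimal import Decimal


def align(x, y):
    """Step 1: shift the mantissas so that both operands share the larger exponent."""
    (mx, ex), (my, ey) = x, y
    e = max(ex, ey)
    # scaleb(n) multiplies by 10**n exactly, i.e. it moves the decimal point.
    return (mx.scaleb(ex - e), my.scaleb(ey - e), e)


def add_mantissas(aligned):
    """Step 2: sum the aligned mantissas; the exponent is unchanged."""
    mx, my, e = aligned
    return (mx + my, e)


def normalize(result):
    """Step 3: bring the mantissa back into the assumed range [0.1, 1)."""
    m, e = result
    while abs(m) >= 1:
        m, e = m.scaleb(-1), e + 1
    while m != 0 and abs(m) < Decimal("0.1"):
        m, e = m.scaleb(1), e - 1
    return (m, e)


def fp_add(x, y):
    """Sequential addition: the three steps performed one after another."""
    return normalize(add_mantissas(align(x, y)))
```

For instance, `fp_add((Decimal("0.5"), 2), (Decimal("0.75"), 2))` sums the mantissas to 1.25 and then normalizes the result to a mantissa of 0.125 with an exponent of 3. Each function consumes the output of the one before it, which is why the steps of a single addition cannot be performed simultaneously.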
It is not possible to perform all three steps of one addition simultaneously, because each step after the first requires the output of the preceding step. However, by pipelining, the steps of a series of additions can be overlapped. Thus, the second step of one addition can be performed concurrently with the first step of the next following addition, and so on. With reference to FIG. 1, this is done by designing the first combinational circuit 15 as a mantissa aligner, the second combinational circuit 19 as a mantissa adder, and the third combinational circuit 23 as a result normalizer. The first two floating-point numbers to be added, say $X_1 = 0.95 \times 10^{3}$ and $Y_1 = 0.82 \times 10^{2}$, are latched into the first latch 13 and presented to the mantissa aligner. The mantissa aligner converts $Y_1$ to the form $Y_1' = 0.082 \times 10^{3}$ and presents both numbers to the second latch 17. On the next clock pulse, $X_1$ and $Y_1'$ are presented to the mantissa adder and simultaneously the second two numbers to be added, $X_2$ and $Y_2$, are presented to the mantissa aligner. While the mantissa adder is adding $X_1$ and $Y_1'$ to get $1.032 \times 10^{3}$, the mantissa aligner is aligning $X_2$ and $Y_2$. On the next clock pulse, the result from the mantissa adder is latched through the third latch 21 to the result normalizer; meanwhile, the aligned $X_2$ and $Y_2$ are presented to the mantissa adder and the third two numbers to be added, $X_3$ and $Y_3$, are presented to the mantissa aligner. The result normalizer converts $1.032 \times 10^{3}$ to $0.1032 \times 10^{4}$; simultaneously, the mantissa adder adds the aligned $X_2$ and $Y_2$ while the mantissa aligner aligns $X_3$ and $Y_3$. Thus, once the pipeline is full, that is, once three pairs of numbers have entered it, a new result is produced every unit of time.
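The overlap described above can be traced with a small clock-by-clock simulation. The following is a sketch under simplifying assumptions: every stage is taken to need exactly one unit of time, the items are opaque labels rather than actual numbers, and the function name `run_pipeline` is hypothetical.

```python
def run_pipeline(n_stages, items):
    """Trace which item occupies each stage during every unit of time.

    Returns (timeline, completed), where timeline[t][s] is the item that
    stage s works on during time unit t (None if the stage is idle) and
    completed lists the items in the order they leave the last stage.
    """
    in_stage = [None] * n_stages
    pending = list(items)
    timeline, completed = [], []
    # Keep clocking while work is waiting or has not yet reached the last stage.
    while pending or any(x is not None for x in in_stage[:-1]):
        # Clock pulse: every item advances one stage and a new item enters.
        in_stage = [pending.pop(0) if pending else None] + in_stage[:-1]
        timeline.append(list(in_stage))
        if in_stage[-1] is not None:
            completed.append(in_stage[-1])
    return timeline, completed


if __name__ == "__main__":
    pairs = ["X1+Y1", "X2+Y2", "X3+Y3", "X4+Y4"]
    timeline, _ = run_pipeline(3, pairs)
    for t, row in enumerate(timeline, start=1):
        print(t, row)  # columns: aligner, adder, normalizer
```

With three stages and four pairs of operands the trace is six time units long, against twelve for a strictly sequential machine; in general N additions occupy N + 2 time units rather than 3N, which approaches the factor-of-three improvement as N grows.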
More information on computer pipelining may be found in such reference texts as Hennessy & Patterson, Computer Architecture: A Quantitative Approach, Morgan Kaufmann Pub., 1990, ch. 6; Stone (ed.), Introduction to Computer Architecture (2d Ed.), SRA Inc., 1980, ch. 9; and Mano, Computer System Architecture (2d Ed.), Prentice-Hall, 1982, pp. 277 et seq.
From the foregoing it will be apparent that many kinds of repetitive computational tasks can be executed faster in a pipelined computer than in a simple sequential one. It will also be apparent that the combinational circuits which make up the various stages of a pipeline sometimes must be specially designed for a specific task or for a group of related tasks. Thus, in designing a pipelined computer, the designer must design one or several pipelines for those tasks which can best be performed in a pipelined system. Which tasks should be performed in a pipelined system, and which stages the pipeline should have, are factors that in general will be decided by the designer so as to best satisfy whatever design specifications the designer has created (or has been given).
A task that a computer designer must often perform is to divide a pipeline stage in two. To do this requires calculating signal processing times at many points in the logic circuitry that makes up the stage, identifying those points at which the circuit can be divided without getting the various signals out of sync with each other, and determining at which points to make the division according to how much processing time is desired in each of the new stages into which the existing stage is to be divided. A CAD system that could perform this task automatically would be of great value to computer system designers.
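To make the nature of this task concrete, the following sketch shows one simple way the calculation could be approached in software. It is offered as an illustration under stated assumptions and not as the method of the invention: the stage is modeled as a dictionary of gates with fixed delays, the example circuit `STAGE` and the names `arrival_times` and `split_stage` are hypothetical, and the division is made by placing every gate whose output settles within a chosen time budget into the first new stage. Every signal that crosses the division, including any original input used only by the second new stage, is assumed to pass through the new latch so that the signals remain in step.

```python
from graphlib import TopologicalSorter  # Python 3.9 or later

# Hypothetical stage: gate -> (delay, driving signals).
# Signals that are not gates (a, b, c, d) are inputs to the stage.
STAGE = {
    "g1": (2, ["a", "b"]),
    "g2": (3, ["g1", "c"]),
    "g3": (1, ["g2"]),
    "g4": (2, ["g3", "d"]),
}


def arrival_times(stage):
    """Time at which each gate output settles, with stage inputs ready at time 0."""
    deps = {g: [s for s in srcs if s in stage] for g, (_, srcs) in stage.items()}
    t = {}
    for g in TopologicalSorter(deps).static_order():  # drivers come first
        delay, srcs = stage[g]
        t[g] = delay + max((t[s] for s in srcs if s in t), default=0)
    return t


def split_stage(stage, budget):
    """Divide a stage so that the first new stage settles within `budget`."""
    t = arrival_times(stage)
    first = {g: stage[g] for g in stage if t[g] <= budget}
    second = {g: stage[g] for g in stage if t[g] > budget}
    # Signals used by the second stage but not produced inside it must be latched.
    latched = sorted({s for _, srcs in second.values() for s in srcs if s not in second})
    d1 = max(arrival_times(first).values(), default=0)
    d2 = max(arrival_times(second).values(), default=0)
    return first, second, latched, d1, d2
```

For the example circuit, whose undivided processing time is 8 units, `split_stage(STAGE, 5)` places `g1` and `g2` in the first new stage and `g3` and `g4` in the second, latches the signals `g2` and `d`, and yields processing times of 5 and 3 units respectively. Trying several budgets shows how the choice of division point trades processing time between the two new stages.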