Computing pipelines have long been used in signal processing, general purpose computing devices and other digital computing applications. In a computing pipeline, information flows from one stage to another, primarily in one direction through the pipeline, and is processed in various ways at the various stages of the pipeline.
One early application of computing pipelines is for rendering computer graphic images. In this kind of pipeline, data representing the image is passed from the computer memory through a series of processing stages and ultimately appears on the computer display. Another kind of computing pipeline is commonly used for multiplication. Here the many additions of which the multiplication is composed are arranged as the stages of a pipeline. As a multiplicand passes through the pipeline, partial products are accumulated in each stage such that at the end of the pipeline a complete product has been formed. In these uses of pipelines in computers, data elements flow in only a single direction.
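As an illustrative sketch only, the pipelined multiplication described above can be modeled in software as a shift-and-add multiplier in which each stage examines one multiplier bit and accumulates one partial product. The names `WIDTH`, `stage_step`, and `run_pipeline` are hypothetical, and the four-stage width is an assumption made for brevity; it is not taken from any particular hardware design.

```python
from typing import List, Optional, Tuple

WIDTH = 4  # assumed number of multiplier bits, and therefore of pipeline stages

# Each item in flight is (multiplicand, multiplier, partial_product).
Item = Tuple[int, int, int]


def stage_step(stage: int, item: Item) -> Item:
    """Stage `stage` adds the shifted multiplicand if multiplier bit `stage` is set."""
    multiplicand, multiplier, partial = item
    if (multiplier >> stage) & 1:
        partial += multiplicand << stage
    return (multiplicand, multiplier, partial)


def run_pipeline(pairs: List[Tuple[int, int]]) -> List[int]:
    """Push operand pairs through the stages, one stage per tick, in one direction."""
    regs: List[Optional[Item]] = [None] * WIDTH  # regs[s] holds the output of stage s
    pending = list(pairs)
    results: List[int] = []
    while pending or any(r is not None for r in regs):
        # The item leaving the last stage carries a complete product.
        if regs[-1] is not None:
            results.append(regs[-1][2])
        # Advance every item by one stage, working from the back to the front.
        for s in range(WIDTH - 1, 0, -1):
            prev = regs[s - 1]
            regs[s] = stage_step(s, prev) if prev is not None else None
        # Admit a new operand pair into the first stage, if one is waiting.
        if pending:
            a, b = pending.pop(0)
            regs[0] = stage_step(0, (a, b, 0))
        else:
            regs[0] = None
    return results
```

In this model several multiplications are in flight at once, each at a different stage, and the multiplier is assumed to fit in `WIDTH` bits.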
Reduced Instruction Set Computer (RISC) processors also use an internal pipeline to execute instructions. At the first stage of the pipeline, instructions are fetched from instruction memory. At subsequent stages they are variously decoded, executed, and their answers recorded. In such pipelines it is common to have "bypass" connections that connect the outputs of subsequent stages to auxiliary inputs of previous stages so that data calculated by earlier instructions may pass as soon as possible to later ones. Without bypass paths, all calculated data would have to be recorded in a register file before being accessible to subsequent instructions.
In such RISC processors, a multiplicity of bypass paths creates a major design problem. Because bypass is required from nearly every stage to nearly every previous stage, each stage has many inputs. Designing the control system for such a RISC computer is rendered difficult by the need to manage data flow on so many data paths. Because each stage must choose whether to take its input from its predecessor stage or from any of a number of bypass paths, the design of even a single stage becomes very complex.
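The input selection that each stage must perform can be sketched as follows. This is a simplified model under stated assumptions, not a description of any particular processor: the function `read_operand`, the dictionary used as a register file, and the list of `(destination register, value)` pairs standing in for the bypass paths are all hypothetical.

```python
from typing import Dict, List, Tuple


def read_operand(
    reg: int,
    register_file: Dict[int, int],
    bypass: List[Tuple[int, int]],
) -> int:
    """Select an operand from the bypass paths or, failing that, the register file.

    `bypass` lists (destination_register, value) pairs for results that have
    been computed but not yet recorded, ordered from the most recent
    instruction to the oldest. The most recent matching result wins.
    """
    for dest, value in bypass:
        if dest == reg:
            return value  # forwarded result, not yet in the register file
    return register_file[reg]  # no instruction in flight writes this register
```

Each entry in `bypass` corresponds to one additional input that the stage must examine, so the selection logic grows with the number of bypass paths.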
Most computing devices in use today are synchronous and use an externally provided clock signal to step through their sequence of operations. Each action takes place only after arrival of the next clock event, and all parts act, if at all, at precisely the same intervals.
In asynchronous circuits each individual part acts independently whenever local conditions permit it to do so. Local logic detects when conditions are right and initiates the appropriate action. Each stage in an asynchronous pipeline sends data forward to the next stage, without reference to any external clocking signal, whenever the two stages agree that such a transaction is proper.
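The local agreement between adjacent stages can be illustrated by a small software model. This is a sketch under simplifying assumptions: a transfer is taken to be proper when the sending stage holds data and the receiving stage is empty, and a seeded random choice stands in for the unpredictable relative timing of the stages. The name `run_async_fifo` and its parameters are hypothetical.

```python
import random
from typing import List, Optional


def run_async_fifo(items: List[int], n_stages: int = 3, seed: int = 0) -> List[int]:
    """Move items through a one-directional pipeline with no global clock."""
    rng = random.Random(seed)
    slots: List[Optional[int]] = [None] * n_stages
    source = list(items)
    sink: List[int] = []
    while source or any(s is not None for s in slots):
        # Collect every transfer that local conditions currently permit.
        enabled: List[int] = []
        if source and slots[0] is None:
            enabled.append(-1)  # source -> first stage
        for i in range(n_stages - 1):
            if slots[i] is not None and slots[i + 1] is None:
                enabled.append(i)  # stage i -> stage i + 1
        if slots[-1] is not None:
            enabled.append(n_stages - 1)  # last stage -> sink
        # Exactly one permitted transfer fires; which one is arbitrary.
        i = rng.choice(enabled)
        if i == -1:
            slots[0] = source.pop(0)
        elif i == n_stages - 1:
            sink.append(slots[-1])
            slots[-1] = None
        else:
            slots[i + 1] = slots[i]
            slots[i] = None
    return sink
```

Whatever order the transfers fire in, the items emerge in the order in which they entered, because each transfer depends only on the state of the two stages involved.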
A bypass path structure in synchronous systems is undesirable for two reasons. First, as integrated circuits get larger, the delay in a long bypass path may become excessive and require a slower clock rate. Second, in a large integrated circuit, it is difficult to deliver identical clock signals to all parts of the pipeline. Differences in the timing of clock signals to different parts of the pipeline are called "clock skew". It may be difficult or impossible to accept data coming from a source remotely located in the pipeline whose clock is skewed with respect to nearby clock signals.
In an asynchronous system, bypass paths are very difficult to implement because remote sections of an asynchronous system operate at times entirely independent of the times at which local sections operate. Great care must be taken when moving data between widely separated stages in an asynchronous system. Failure to exercise adequate design care may permit occasional data elements to be damaged or lost and thus render the system unreliable. The difficulty of this task accounts in part for the very infrequent current use of asynchronous systems.
Finally, the presence of bypass structures requires careful control of when data actually move in the pipeline. In most systems in use today, if any stage is unable to move its data, it informs all other stages of its stall and they all wait. Because the stall signal may originate in any or all stages and must be delivered to all stages, it involves not only a logic function with many input terms, but also a lengthy communication path, both of which contribute to delay. Thus the stall signal itself may be a pacing item in the system.
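The global stall arrangement described above can be sketched as a single function that combines the stall requests of all stages. The model is a simplification made for illustration; the name `clock_tick` and its arguments are hypothetical.

```python
from typing import List, Optional, Tuple


def clock_tick(
    regs: List[Optional[str]],
    stalled: List[bool],
    new_item: Optional[str] = None,
) -> Tuple[List[Optional[str]], Optional[str]]:
    """Advance a synchronous pipeline by one clock event, honoring a global stall.

    `regs[i]` is the data held in stage i and `stalled[i]` is that stage's
    stall request. Returns the new stage contents and the item, if any,
    that left the last stage.
    """
    # One logic function with an input term from every stage.
    global_stall = any(stalled)
    if global_stall:
        # Every stage waits, including those that could have moved their data.
        return list(regs), None
    leaving = regs[-1]
    return [new_item] + list(regs[:-1]), leaving
```

The `any(stalled)` term is the many-input logic function, and the need to return its value to every stage corresponds to the long communication path.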
Asynchronous pipelines are rare largely because designers have considered them too difficult to design. Some asynchronous pipelines are now used in First In First Out buffer memories, mainly in signal processing and input/output applications. A particularly simple form of asynchronous pipeline was described by Sutherland in U.S. Pat. Nos. 5,187,800 and 4,837,740 and in the publication entitled "Micropipelines". In the asynchronous pipeline devices that have heretofore been built, information flows in only a single direction or, if information is to flow in more than one direction, entirely separate mechanisms are used for the separate directions. Such structures are merely compound uses of the one-directional pipeline.
The design of a RISC computer with many bypass paths in the asynchronous design style has heretofore proven beyond the capability of designers. The few asynchronous RISC processors that have heretofore been designed (Caltech, Manchester) have avoided bypass paths and thus suffered unnecessary delay and performance degradation.