This specification describes communicating sequential processes (CSP) that are implemented as quasi-delay-insensitive asynchronous circuits. More specifically, the present specification teaches reshuffling communication sequences and combining computation with buffering to produce pipelined circuits.
Asynchronous processors are known, as described in U.S. Pat. No. 5,752,050. These processors process an information stream without a global clock synchronizing the operation.
An asynchronous processor pipeline scheme uses the basic layout shown in FIG. 1. A first process 100 communicates with a second process 110 that in turn sends a message to the next process. The messages use a four phase handshake. In the first phase, the sender raises the request line. In the second phase, the receiver raises the acknowledge line. In the third phase, the sender lowers the request line. In the fourth phase, the receiver lowers the acknowledge line. In the handshaking expansion language (HSE), the handshake on channel X is described as X+; Xa+; X−; Xa−. In FIG. 1, the request between 100 and 110 is the L wire (102). The acknowledge for that communication is La (108).
The request between 110 and 120 is the R wire (104), and the acknowledge is Ra (106).
This is a basic request/acknowledge system. The request (L) is acknowledged (La), then acted on by raising the next request (R↑), which is acknowledged in turn (Ra).
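The four-phase handshake described above can be sketched as an ordered list of wire transitions. This is only an illustrative model: the function names `four_phase_handshake` and `to_hse` are hypothetical, and an ASCII `-` stands in for the falling-transition symbol of the HSE notation.

```python
def four_phase_handshake(channel):
    """Return the ordered wire transitions of one four-phase handshake.

    `channel` is the request wire name (e.g. "L"); the acknowledge wire
    is assumed to be named by appending "a" (e.g. "La"), as in the text.
    """
    req, ack = channel, channel + "a"
    return [
        (req, 1),  # phase 1: sender raises the request line
        (ack, 1),  # phase 2: receiver raises the acknowledge line
        (req, 0),  # phase 3: sender lowers the request line
        (ack, 0),  # phase 4: receiver lowers the acknowledge line
    ]


def to_hse(transitions):
    """Render transitions in an HSE-like form, using '+' for up and '-' for down."""
    return "; ".join(f"{wire}{'+' if value else '-'}" for wire, value in transitions)


def final_wire_state(transitions):
    """Replay the transitions from the idle state (all wires low)."""
    state = {}
    for wire, value in transitions:
        state[wire] = value
    return state


print(to_hse(four_phase_handshake("X")))
```

After the fourth phase both wires are low again, so the channel is back in its idle state and ready for the next communication.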
One known style of pipelined asynchronous circuit, called “Bundled-Data” or “Micropipelines,” has a synchronous-style datapath that is “clocked” by asynchronous self-timed control elements. These control elements handshake between pipeline stages with a request/acknowledge pair. The delay of the datapath logic is estimated with a delay element in the control, so that the request to the next pipeline stage is not made until the data is assumed to be valid.
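The timing assumption behind the bundled-data style can be illustrated with a small sketch. The function names and the abstract time units are hypothetical; the model only captures the idea that the request is postponed by the delay element, and that correctness depends on that delay being at least as long as the actual datapath delay.

```python
def next_request_time(launch_time, matched_delay):
    """Time at which the control raises the request to the next stage.

    The request is simply the launch event passed through the delay element.
    """
    return launch_time + matched_delay


def timing_assumption_holds(logic_delay, matched_delay):
    """True when the delay element is at least as slow as the datapath logic.

    If this is False, the next stage may be told the data is valid
    before the datapath has actually settled.
    """
    return matched_delay >= logic_delay


# Hypothetical numbers in arbitrary time units.
print(next_request_time(launch_time=10, matched_delay=5))
print(timing_assumption_holds(logic_delay=4, matched_delay=5))
print(timing_assumption_holds(logic_delay=6, matched_delay=5))
```

The third call shows the failure case: the data is only assumed to be valid, so an underestimated delay silently breaks the stage.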
The alternative style involves (quasi) delay-insensitive circuits, for which no delay assumptions are made. In this style, the prior art is embodied in the Caltech Asynchronous Microprocessor patent. Datapaths are still separated from control, as in the bundled-data case, but completion detection circuitry is added instead of delay lines to detect when the data is valid. Communication between processes occurs via delay-insensitive channels with a four-phase handshake. Between latches or buffers, logic can be performed by unpipelined weak-condition logic blocks.
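Completion detection can be sketched as follows, assuming a dual-rail data encoding. The passage above does not name an encoding, so dual-rail is an assumption made only for illustration, and the function names are hypothetical. Each bit is carried on a pair of rails (true, false): the bit is valid when exactly one rail is high and neutral when both are low.

```python
def data_valid(rails):
    """True when every bit has exactly one of its two rails high.

    `rails` is a list of (true_rail, false_rail) pairs of 0/1 values.
    """
    return all(t != f for t, f in rails)


def data_neutral(rails):
    """True when every rail is low, i.e. the word has returned to neutral."""
    return all(t == 0 and f == 0 for t, f in rails)


# A two-bit word: bit 0 is 1 (true rail high), bit 1 is 0 (false rail high).
word = [(1, 0), (0, 1)]
print(data_valid(word))

# Bit 1 has not arrived yet, so the word as a whole is not yet valid.
partial = [(1, 0), (0, 0)]
print(data_valid(partial), data_neutral(partial))
```

Because validity is computed from the data wires themselves, no delay estimate is needed: the acknowledge can be generated only once `data_valid` holds, and the channel can be reused only once `data_neutral` holds.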