It has been shown that asynchronous circuits can improve the throughput of a circuit and can be more robust to process variability and environmental changes. This can potentially allow designers to use asynchronous circuits in ASIC design flows. The omission of the clock network, together with the fact that asynchronous circuits can be active only when they are performing useful functions, can inherently contribute to a reduction of switching activity and hence to power savings. These benefits, however, come at the expense of incorporating handshaking signals, completion detection trees, distributed controllers, and timing assumptions. The extra overhead might lead to a circuit with more area and higher power consumption compared to a synchronous implementation.
Therefore, designers of low-power asynchronous circuits typically endeavor to avoid excessive overhead in order to compete with the equivalent synchronous implementation.
Because of the more complicated structure of asynchronous circuits, they have not been adopted by commercial computer-aided design (“CAD”) tool developers to the extent that synchronous circuits have been. Thus, a circuit designer does not have a wide range of options when it comes to design automation of asynchronous circuits.
This has motivated many asynchronous designers to exploit synchronous CAD tools for synthesizing asynchronous circuits. There are multiple instances in the literature in which designers have tried to use a familiar synchronous design flow for an asynchronous flow and fill the gaps with rather simple ad-hoc algorithms in order to build up an asynchronous circuit design flow. Often, the original legacy circuit is described at a synchronous register transfer level (“RTL”) as a netlist, that is, an interconnection of primitive circuit elements of an electronic design. Netlists usually convey connectivity information and at a basic level provide nothing more than instances, nets, and perhaps some attributes.
Various approaches exist for starting with a synchronous netlist to produce an asynchronous netlist. The following are significant examples of such approaches:
A de-synchronization approach has been used, as described by J. Cortadella, et al. “Desynchronization: Synthesis of Asynchronous Circuits From Synchronous Specifications,” Computer-Aided Design of Integrated Circuits and Systems, IEEE Transactions on. Volume 25, Issue 10, pp. 1904-1921 (October 2006). In this method, each flip-flop is converted into two latches: an odd and an even latch. The clock tree is then replaced by a set of handshaking signals. Asynchronous local controllers are added to the netlist to enable the latches and control the flow of data so that the flow of data in the asynchronous netlist is equivalent to the flow of data in the synchronous netlist.
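The flip-flop conversion described above can be illustrated with a minimal Python sketch. The `FlipFlop` and `Latch` records, the net-naming scheme, and the choice of making the first latch the "even" one are assumptions made for illustration only; they are not taken from the cited paper, and the local handshake controllers that would drive the enable nets are not modeled.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class FlipFlop:
    name: str
    d: str  # data input net
    q: str  # data output net


@dataclass(frozen=True)
class Latch:
    name: str
    d: str       # data input net
    q: str       # data output net
    enable: str  # net driven by a local handshake controller (hypothetical name)
    parity: str  # "even" or "odd"


def split_flip_flop(ff: FlipFlop) -> Tuple[Latch, Latch]:
    """Replace one flip-flop with an even latch feeding an odd latch."""
    mid = f"{ff.name}_mid"  # new internal net between the two latches
    even = Latch(f"{ff.name}_even", ff.d, mid, f"{ff.name}_en_even", "even")
    odd = Latch(f"{ff.name}_odd", mid, ff.q, f"{ff.name}_en_odd", "odd")
    return even, odd
```

In this sketch the original data input and output nets are preserved, so the surrounding combinational logic is untouched; only the clock pin disappears in favor of two separate enable nets.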
A phased logic approach is described in D. H. Linder, et al. “Phased logic: supporting the synchronous design paradigm with delay-insensitive circuitry,” Mississippi State Univ., IEEE Transactions on Computers, vol. 45, issue 9, pp. 1031-1044 (September 1996). In this method, the modules in the synchronous netlist are replaced by equivalent phased logic modules. In phased logic, each signal is encoded with two Level Encoded Dual Rail (“LEDR”) signals. After the initial conversion, the liveness and safeness problems are analyzed, and extra buffers and token-buffers are added if necessary. Although some FPGA implementations of this technique have been reported, in general custom LEDR library development is needed.
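As an illustration of the two-rail encoding mentioned above, the following Python sketch models one common formulation of LEDR, in which a value rail `v` carries the data bit and a second rail `t` is chosen so that `v XOR t` gives the phase of the token. The function names and the convention that phase 0 means both rails are equal are assumptions of this sketch, not details taken from the cited paper.

```python
from typing import Tuple


def ledr_encode(value: int, phase: int) -> Tuple[int, int]:
    """Return (v, t) such that v is the data bit and v XOR t is the phase."""
    v = value & 1
    t = v ^ (phase & 1)
    return v, t


def ledr_decode(v: int, t: int) -> Tuple[int, int]:
    """Return (value, phase) for a pair of rails."""
    return v, v ^ t


def next_token(prev: Tuple[int, int], value: int) -> Tuple[int, int]:
    """Encode the next data bit in the opposite phase of the previous token."""
    _, prev_phase = ledr_decode(*prev)
    return ledr_encode(value, prev_phase ^ 1)
```

Because consecutive tokens alternate phase, exactly one of the two rails changes for each new token in this model, whether or not the data bit itself changes. No return-to-zero spacer is needed between tokens.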
A null convention logic approach is described in Karl M. Fant, et al. “NULL Convention Logic” (Theseus Logic, Inc.), and available at http://www.cs.ucsc.edu/~sbrandt/papers/NCL2.pdf. This method starts from conventional HDL, which is synthesized into an intermediate library called 3NCL. This library is still a single-rail library but with the addition of an extra possible value (the NULL value) for all wires. This preserves single-rail simulation and design capabilities, while emulating the final dual-rail gates. The final library is a full dual-rail library. Next, a second run of synthesis is performed to translate the 3NCL gates into 2NCL gates, which are the true dual-rail gates that will be used for the physical design process. In order to assure delay-insensitive (“DI”) behavior, only a limited variety of gates are used (2-input NAND, NOR, XOR).
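The relationship between a single-rail wire with an added NULL value and its dual-rail counterpart can be sketched in Python as follows. The sketch assumes the usual dual-rail convention in which NULL is represented by both rails low and each data value asserts exactly one rail; the function names are hypothetical, and gate behavior (including any hysteresis) is not modeled.

```python
from typing import Optional, Tuple

NULL = None  # the extra "no data" value carried by a three-valued wire


def to_dual_rail(value: Optional[int]) -> Tuple[int, int]:
    """Map a three-valued wire (0, 1, or NULL) to (rail0, rail1)."""
    if value is NULL:
        return (0, 0)
    if value == 0:
        return (1, 0)
    if value == 1:
        return (0, 1)
    raise ValueError(f"not a three-valued wire state: {value!r}")


def from_dual_rail(rail0: int, rail1: int) -> Optional[int]:
    """Map (rail0, rail1) back to 0, 1, or NULL."""
    if (rail0, rail1) == (0, 0):
        return NULL
    if (rail0, rail1) == (1, 0):
        return 0
    if (rail0, rail1) == (0, 1):
        return 1
    raise ValueError("both rails asserted is not a valid codeword")
```

Under this assumed encoding, a simulator can track one three-valued symbol per wire while the physical netlist carries two rails per wire.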
Another approach is described in A. Smirnov, et al. “Synthesizing Asynchronous Micropipelines with Design Compiler,” Proc. SNUG Boston 2006: Synopsys User Group, Sep. 18-19, 2006, Boston, USA. In this method, a synchronous circuit described at the RTL is implemented as an asynchronous micropipeline. Synthesis can be targeted at a wide range of micropipeline protocols and implementations through a standard cell library approach. Primary target applications include high-throughput, low-power designs using domino-like low-latency cells.
A dataflow graph approach is described in International Patent Application No. PCT/US2007/067618 (Publication No. WO/2007/127914) and entitled “Systems And Methods For Performing Automated Conversion Of Representations Of Synchronous Circuit Designs To And From Representations Of Asynchronous Circuit Designs” having Applicant Achronix Semiconductor Corp. and inventor R. Manohar. In this method, a synchronous netlist containing combinational logic, latches, and flip-flops with multiple clock domains and enable signals is converted to an asynchronous circuit using the notion of a dataflow graph. This method eliminates the gating through substitution of a MUX transformation and uses the gating information to make the output of the state-holding element a conditional signal. In such a method, if the state-holding element in the synchronous circuit is gated, either the gating is eliminated using a MUX, or the previous token is regenerated using an asynchronous register module. Hence, the computational modules will be activated and consume a token whose value is the same as the previous token.
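The MUX substitution described above can be illustrated with a small cycle-level Python model. This is a sketch of the general idea only: the function names and the list-based traces are assumptions for illustration and do not come from the cited application. A gated register that holds its value when the enable is low is compared against an ungated register whose input is a MUX selecting between new data and the register's own previous output.

```python
from typing import List


def gated_register_trace(d_seq: List[int], en_seq: List[int], q0: int = 0) -> List[int]:
    """Register that only loads d on cycles where en is 1."""
    q = q0
    trace = []
    for d, en in zip(d_seq, en_seq):
        if en:
            q = d
        trace.append(q)
    return trace


def mux(sel: int, a: int, b: int) -> int:
    """Two-input multiplexer: a when sel is 1, otherwise b."""
    return a if sel else b


def mux_register_trace(d_seq: List[int], en_seq: List[int], q0: int = 0) -> List[int]:
    """Ungated register that loads every cycle; a MUX recirculates q when en is 0."""
    q = q0
    trace = []
    for d, en in zip(d_seq, en_seq):
        q = mux(en, d, q)  # the previous value is re-presented as a new token
        trace.append(q)
    return trace
```

Both models produce the same output sequence, but in the second one the register produces a value on every cycle. That corresponds to the cost noted above: downstream logic consumes a token even when its value repeats the previous one.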
Another approach is described in U.S. Provisional Patent Application Ser. No. 61/047,714, filed 24 Apr. 2008 and entitled “Clustering and Fanout Optimizations of Asynchronous Circuits” to G. Dimou (and assigned to the assignee of the present disclosure), the entire contents of which are incorporated herein by reference.
For such an approach, a synchronous netlist of combinational gates and flip-flops can be converted to asynchronous templates, such as a pre-charged half-buffer (“PCHB”), e.g., as described in “Pipelined Asynchronous Circuits” by Lines, Andrew Matthew (1998), Technical Report, California Institute of Technology, [CaltechCSTR:1998.cs-tr-95-21]. In such an approach, the netlist is first clustered into groups of several gates that can use a shared controller, subject to a given cycle time constraint. The cluster size is limited by the number of inputs and outputs. After clustering, the tool tries to optimize the throughput of the circuit through slack matching and to minimize the area.
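The input/output limit on cluster size can be illustrated with a simple greedy Python sketch. This is not the algorithm of the referenced application: the netlist representation, the function names, and the greedy in-order merging policy are all assumptions for illustration, and the cycle time constraint and slack matching are not modeled.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical netlist form: gate name -> (set of input nets, output net)
Netlist = Dict[str, Tuple[Set[str], str]]


def cluster_ports(netlist: Netlist, members: Set[str],
                  primary_outputs: Set[str]) -> Tuple[Set[str], Set[str]]:
    """Return the external input and output nets of a non-empty group of gates."""
    produced = {netlist[g][1] for g in members}
    consumed = set().union(*(netlist[g][0] for g in members))
    used_outside: Set[str] = set()
    for gate, (ins, _) in netlist.items():
        if gate not in members:
            used_outside |= ins
    inputs = consumed - produced
    outputs = {n for n in produced if n in used_outside or n in primary_outputs}
    return inputs, outputs


def greedy_cluster(netlist: Netlist, order: List[str], primary_outputs: Set[str],
                   max_in: int, max_out: int) -> List[List[str]]:
    """Walk gates in the given order, closing a cluster when a port limit is exceeded."""
    clusters: List[List[str]] = []
    current: List[str] = []
    for gate in order:
        trial = current + [gate]
        ins, outs = cluster_ports(netlist, set(trial), primary_outputs)
        if current and (len(ins) > max_in or len(outs) > max_out):
            clusters.append(current)
            current = [gate]
        else:
            current = trial
    if current:
        clusters.append(current)
    return clusters
```

Merging a gate with its fan-in neighbor hides the net between them, which is why a chain of gates can share one controller until the count of external inputs or outputs reaches the limit.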