1. Field of the Invention
The present invention relates to hardware description languages (HDL), and in particular to extending HDL so that digital designers can code wave-pipelined circuits on a design-wide or chip-wide scale.
2. Description of the Related Art
HDL refers to all current hardware description languages, such as VHDL, Verilog and SystemVerilog.
A synchronous digital system contains many registers. Valid data flow through successive registers from system input registers to system output registers. All data flows are synchronous with the triggering edges of a chip clock. For example, data flow from registers A to registers B, from registers B to registers C and so on in successive order on the same clock cycle.
A path in a synchronous digital system is a route between any neighboring registers connected by combinational logic. If the target running frequency of a digital design is predetermined, the upper limit on the propagation time of any path is also determined and equals the inverse of the target running frequency. A path is called a critical path if the time signals take to propagate through it exceeds this limit, and that time is called the path's critical time. If there are any critical paths, digital designers must spend time reducing the critical times and eliminating all critical paths to meet the target running frequency.
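The definition above can be illustrated with a minimal Python sketch that flags every path whose propagation time exceeds the clock period. The function names, path names and delay values are hypothetical and chosen only for illustration; a real timing report would come from a synthesizer or static timing tool.

```python
def clock_period_ns(target_mhz):
    """Upper limit on path propagation time: the inverse of the target frequency."""
    return 1000.0 / target_mhz


def find_critical_paths(path_delays_ns, target_mhz):
    """Return {path: delay} for every path whose delay exceeds the clock period."""
    limit = clock_period_ns(target_mhz)
    return {name: delay for name, delay in path_delays_ns.items() if delay > limit}


# Hypothetical register-to-register delays in nanoseconds.
paths = {"A->B": 3.2, "B->C": 4.9, "C->D": 7.5}
critical = find_critical_paths(paths, target_mhz=200)  # clock period is 5 ns
print(critical)  # {'C->D': 7.5}
```

In this sketch only the path C->D is critical, and 7.5 ns is its critical time.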
Wave-pipelining is a technology that completes an operation needing several clock cycles to propagate, without intermediate registers and with input data accepted on every clock cycle. For example, in a conventional pipelining operation, data flow from registers A to registers D through registers B and C, which divide the critical path into smaller intervals that each meet the clock period: A→B→C→D; with wave-pipelining, data flow from registers A to registers D without the intermediate registers B and C. Where it can be used, wave-pipelining reduces logic resource usage and in that respect is superior to conventional pipelining.
FIG. 1 shows a prior art overview of how wave-pipelining is applied. There are input registers FFi and output registers FFo; data flow from the input registers FFi through combinational logic paths to the output registers FFo, and signals take more than one clock cycle to propagate through the logic, without any intermediate registers and with input data accepted on every clock cycle. In the combinational logic block there are two special paths marked Dmax and Dmin. Dmax is the longest path for signals to propagate from the input registers FFi to the output registers FFo, while Dmin is the shortest.
FIG. 2 shows a prior art timing diagram that any wave-pipelined circuit must comply with if input data is to be accepted on every clock cycle and earlier sent data is not to be contaminated by later sent data.
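As a rough illustration of how Dmax and Dmin might be obtained for a combinational block, the following Python sketch walks a gate network in topological order and tracks the shortest and longest arrival times at each node. It assumes a single fixed delay per gate, which is a simplification; the netlist, the node names and the delay values are hypothetical.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def node_delays(fanin, gate_delay):
    """Return {node: (shortest, longest)} propagation delay from the primary inputs.

    fanin maps each gate to the nodes driving it; nodes with no drivers are
    treated as primary inputs with zero delay.
    gate_delay maps each gate to one fixed delay (a simplifying assumption).
    """
    arrival = {}
    for node in TopologicalSorter(fanin).static_order():
        drivers = fanin.get(node, [])
        if not drivers:
            arrival[node] = (0, 0)
        else:
            d = gate_delay[node]
            arrival[node] = (
                min(arrival[p][0] for p in drivers) + d,
                max(arrival[p][1] for p in drivers) + d,
            )
    return arrival


# Hypothetical netlist: inputs a and b, gates g1 and g2, output gate "out".
fanin = {"g1": ["a", "b"], "g2": ["g1"], "out": ["g2", "b"]}
gate_delay = {"g1": 2, "g2": 3, "out": 1}
d_min, d_max = node_delays(fanin, gate_delay)["out"]
print(d_min, d_max)  # 1 6
```

Here the short path b→out gives the minimum delay of 1 and the long path through g1 and g2 gives the maximum delay of 6; the gap between the two is what limits the wave-pipelined clock rate.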
Here are the most important inequalities involving wave-pipelining, from the paper “Wave-Pipelining: A Tutorial and Research Survey” by Wayne P. Burleson et al. in IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 6, no. 3, pp. 464-474, September 1998. The following symbols are used:
    Dmin and Dmax: The minimum and maximum propagation delays in the combinational logic block.
    Tck: Clock period.
    Ts, Th: Register setup and hold times.
    Dr: Propagation delay of a register.
    Δ: Constructive known clock skew between the output and input registers.
    Δck: Worst case uncontrolled clock skew at a register.
    N: The number of clock cycles needed for a signal to propagate through the logic block before being latched by the output register.
    Tl: The time at which the data should be clocked by the triggering edge of the output register N clock cycles after it has been clocked by the input register.
    Tsx: The minimum time that node x must be stable to correctly propagate a signal through the gate.
    dmin(x), dmax(x): The shortest and longest propagation delays from primary inputs to node x in the combinational logic block.
Due to possible constructive skew Δ (of arbitrary value) between the output and the input registers:
    Tl = N·Tck + Δ.  (1)
The lower bound on Tl is given by
    Tl > Dr + Dmax + Ts + Δck.  (2)
The upper bound on Tl is given by
    Tl < Tck + Dr + Dmin − (Δck + Th).  (3)
Combining constraints (2) and (3) gives the well-known maximum rate pipelining condition of Cotton:
    Tck > (Dmax − Dmin) + Ts + Th + 2Δck.  (4)
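To make condition (4) concrete, the following Python sketch evaluates its right-hand side and, for comparison, the right-hand side of inequality (2) with N = 1 and zero constructive skew, which is the single-cycle case. The function names and the delay values (in picoseconds) are hypothetical.

```python
def cotton_min_period(d_max, d_min, t_s, t_h, skew_ck):
    """Right-hand side of inequality (4); Tck must be strictly greater than this."""
    return (d_max - d_min) + t_s + t_h + 2 * skew_ck


def single_cycle_min_period(d_r, d_max, t_s, skew_ck):
    """Right-hand side of inequality (2) with N = 1 and zero constructive skew."""
    return d_r + d_max + t_s + skew_ck


# Hypothetical delays in picoseconds, for illustration only.
wave_bound = cotton_min_period(d_max=9000, d_min=7000, t_s=100, t_h=50, skew_ck=75)
single_bound = single_cycle_min_period(d_r=200, d_max=9000, t_s=100, skew_ck=75)
print(wave_bound, single_bound)  # 2300 9375
```

With these numbers the bound from (4) depends on the 2000 ps spread between Dmax and Dmin rather than on Dmax itself, which is why it is far below the single-cycle bound.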
Combining inequalities (1), (2) and (3) gives the following inequality:
    Dr + Dmax + Ts + Δck < N·Tck + Δ < Tck + Dr + Dmin − (Δck + Th).  (5)
To simplify the interpretation of the above relations, two parameters Tmax and Tmin are introduced:
    Tmax = Dr + Dmax + Ts + Δck − Δ  (6)
which represents the maximum delay through the logic, including clocking overhead and clock skews, while
    Tmin = Dr + Dmin − Δck − Th − Δ  (7)
represents the minimum delay through the logic. With this, (5) can be expressed as follows:
    Tmax/N < Tck < Tmin/(N−1)  (8)
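Relations (6) to (8) can be explored with a short Python sketch that computes Tmax and Tmin and then lists, for each N, the open interval of clock periods satisfying (8). The function names and delay values (in picoseconds) are hypothetical. For N = 1 the sketch falls back on (5), where the upper bound reduces to Tmin > 0, because (8) would otherwise divide by zero.

```python
def t_max(d_r, d_max, t_s, skew_ck, skew):
    """Equation (6): maximum delay including clocking overhead and skews."""
    return d_r + d_max + t_s + skew_ck - skew


def t_min(d_r, d_min, t_h, skew_ck, skew):
    """Equation (7): minimum delay through the logic."""
    return d_r + d_min - skew_ck - t_h - skew


def tck_window(tmax, tmin, n):
    """Open interval (low, high) of clock periods satisfying (8), or None if empty."""
    if n < 1:
        raise ValueError("N must be a positive integer")
    low = tmax / n
    if n == 1:
        # From (5): (N-1)*Tck < Tmin reduces to 0 < Tmin, so there is no upper limit.
        return (low, float("inf")) if tmin > 0 else None
    high = tmin / (n - 1)
    return (low, high) if low < high else None


# Hypothetical delays in picoseconds, with zero constructive skew.
tmax = t_max(d_r=200, d_max=9000, t_s=100, skew_ck=75, skew=0)  # 9375
tmin = t_min(d_r=200, d_min=7000, t_h=50, skew_ck=75, skew=0)   # 7075
for n in range(1, 6):
    print(n, tck_window(tmax, tmin, n))
```

With these values the window narrows as N grows and is empty at N = 5, so at most four waves could be in flight in this hypothetical block.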
If, for a temperature above the nominal, Tmax and Tmin are increased by a factor βs>1 and, for a temperature below the nominal, decreased by a factor βf<1, the following inequality must hold:
    βs*Tmax/N < Tck < βf*Tmin/(N−1)  (9)
Inequality (9) still holds when other sources of delay variation are included, provided the parameters βs and βf are redefined to account for them.
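The effect of the derating factors in (9) can be sketched in Python as follows. The function name, the Tmax and Tmin values (in picoseconds) and the factors βs = 1.1 and βf = 0.9 are hypothetical and only illustrate how the usable window shrinks.

```python
def derated_window(tmax, tmin, n, beta_s, beta_f):
    """Open interval (low, high) of clock periods satisfying (9), or None if empty."""
    if n < 2:
        raise ValueError("inequality (9) is evaluated here for N >= 2 only")
    low = beta_s * tmax / n          # slow corner stretches the longest delay
    high = beta_f * tmin / (n - 1)   # fast corner shrinks the shortest delay
    return (low, high) if low < high else None


# Hypothetical values: Tmax = 9375 ps, Tmin = 7075 ps, 10% variation each way.
for n in (2, 3):
    print(n, derated_window(9375, 7075, n, beta_s=1.1, beta_f=0.9))
```

With these numbers a window still exists for N = 2 (roughly 5156 ps to 6368 ps) but none remains for N = 3, although the same Tmax and Tmin would allow N = 3 under the undamped inequality (8).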
The following internal node constraint must also be satisfied at each node x of the circuit:
    Tck > (dmax(x) − dmin(x)) + Tsx + Δck.  (10)
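A per-node check of constraint (10) might look like the following Python sketch. The function name, node names and timing values (in picoseconds) are hypothetical; in practice dmin(x), dmax(x) and Tsx would be supplied by a timing analysis tool.

```python
def violating_nodes(nodes, t_ck, skew_ck):
    """Return the names of nodes that fail constraint (10).

    nodes maps a node name to a tuple (dmin_x, dmax_x, t_sx).
    """
    failed = []
    for name, (dmin_x, dmax_x, t_sx) in nodes.items():
        if not t_ck > (dmax_x - dmin_x) + t_sx + skew_ck:
            failed.append(name)
    return failed


# Hypothetical internal nodes, all values in picoseconds.
nodes = {
    "n1": (1000, 2500, 120),  # spread 1500: needs Tck > 1695
    "n2": (2000, 5400, 150),  # spread 3400: needs Tck > 3625
}
print(violating_nodes(nodes, t_ck=3300, skew_ck=75))  # ['n2']
```

Node n2 fails because its delay spread alone exceeds the chosen clock period, so successive waves could overlap at that node even if the register-to-register inequalities were met.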
Currently many memory chip manufacturers successfully use wave-pipelining in their memory chip products to achieve higher output rates with reduced power consumption and logic resources, and a few scientists have used FPGA chips to show that some circuits can be wave-pipelined in isolated environments. Their work shows that wave-pipelining is a very powerful tool for reducing power consumption and logic resources. There are currently three major obstacles preventing ordinary digital designers from using wave-pipelining in HDL:
    1) Any workable wave-pipelined circuit must guarantee in all situations that earlier sent data will not be contaminated by later sent data. Currently no commercial synthesizer is capable of doing that. Only circuit or synthesizer manufacturers have the capability to accurately calculate point-to-point signal travel timings within a circuit and so determine whether data contamination occurs.
    2) The software algorithms that make wave-pipelining successful, such as the Wong and Klass algorithms and others, have already been developed and matured, but ordinary digital designers have no means or resources to access the technology, because there are no international HDL standards on how synthesizer manufacturers should incorporate those capabilities into their products.
    3) HDL needs capabilities that let digital designers easily write wave-pipelining ready code for any number of critical paths on a design-wide or chip-wide scale rather than in an isolated environment; the written code must be identifiable, synthesizable and usable to generate wave-pipelined circuits by any synthesizer in ASIC or FPGA; and these capabilities should be part of HDL standards.
The present invention aims to:
    1) Provide a wave-pipelining coding system as a new part of HDL standards with which designers can write wave-pipelining ready code, such that:
        a) The code can be easily written in HDL to generate very complex wave-pipelined circuits.
        b) The code can be identified, synthesized and used to generate wave-pipelined circuits by any synthesizer in ASIC or FPGA.
    2) Shift the burden of analyzing and manipulating wave-pipelining ready code, and of generating and implementing wave-pipelined circuits on a design-wide or chip-wide scale in HDL, from individual designers to synthesizer manufacturers.
If the coding system becomes a new part of HDL standards, all synthesizer manufacturers will in effect be forced to implement the well-known wave-pipelining algorithms and techniques within their products, and competition for better implementations will follow, making the wave-pipelining technique available to every digital designer in HDL.
Here are some prior art definitions.
    1) A path in a synchronous digital system is called a critical path if it meets the following three conditions:
        a) The path has input registers and output registers.
        b) The input registers and output registers are connected by combinational logic without intermediate registers.
        c) Signals take more than one clock cycle to propagate through the path under a designated target running frequency.
    2) A critical path may occur in two situations:
        a) When the combinational logic between the input and output registers is so complex that signals take more than one clock cycle to propagate through the path under a designated target running frequency.
        b) When all intermediate registers in a conventional pipeline operation are removed and the operation is to be implemented using wave-pipelining to save resources and reduce power consumption.
    3) Traditionally, conventional wave-pipelining has mostly focused on the second situation in an isolated environment, whereas this invention addresses both situations on a design-wide or chip-wide scale.
    4) A path is called a feedback of a critical path if it meets two conditions:
        a) Input data to the input registers of the critical path partially comes from the middle of its combinational logic.
        b) Signals take more than one clock cycle to propagate from a part of the input registers through the path to a part of the input registers under a designated target running frequency.