1. Field of the Invention
This invention is directed to a stack, and in particular, to a self-timed implementation of a stack.
2. Background of the Related Art
Self-timed or asynchronous methodologies use functional units having an asynchronous interface protocol to pass data and control information. By coupling such asynchronous functional units together to form larger blocks, increasingly complex functions can be realized. FIG. 1 shows two such functional units coupled via data lines and control lines. A first functional unit 100 is a sender, which passes data. The second functional unit 102 is a receiver, which receives the data.
Communication between the functional units 100, 102 is achieved by using data wires 104 and control wires. A request control wire REQ is controlled by the sender 100 and is activated when the sender 100 has placed valid data on the data wires 104. An acknowledge control wire ACK is controlled by the receiver 102 and is activated when the receiver 102 has consumed the data that was placed on the data wires 104. This asynchronous interface protocol is known as a "handshake" because the sender 100 and the receiver 102 both communicate with each other to pass the data on the data wires 104.
The asynchronous interface protocol shown in FIG. 1 can use various timing protocols for data communication. One related art protocol is based on a 4-phase control communication scheme. FIG. 2 shows a timing diagram for the 4-phase control communication scheme.
As shown in FIG. 2, the sender 100 indicates that the data on the data wires 104 is valid by setting the request control wire REQ high (active). The receiver 102 can now use the data as required. When the receiver 102 no longer requires the data, it signals back to the sender 100 by setting the acknowledge control wire ACK high (active). The sender 100 can now remove the data from a communication bus such as the data wires 104 and prepare the next communication.
In the 4-phase protocol, the control lines must be returned to the initial state. Accordingly, the sender 100 deactivates the output request by returning the request control wire REQ low (inactive). On the deactivation of the request control wire REQ, the receiver 102 can return the acknowledge control wire ACK low (inactive) to indicate to the sender 100 that the receiver 102 is ready for more data. The sender 100 and the receiver 102 follow this strict ordering of events to communicate in the 4-phase control communication scheme. Beneficially, however, there is no upper bound on the delays between consecutive events.
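The ordering of events in the 4-phase control communication scheme can be illustrated with a minimal software sketch. The function name `four_phase_transfer` and the event tuples are assumptions introduced only for illustration; the sketch models the order of the REQ and ACK transitions and does not model delays or circuitry.

```python
def four_phase_transfer(value):
    """Model one 4-phase transfer and return (received value, ordered events)."""
    req, ack = 0, 0          # both control wires start low (inactive)
    events = []

    # Phase 1: the sender places valid data on the data wires, then raises REQ.
    data_wires = value
    req = 1
    events.append(("REQ", req))

    # Phase 2: the receiver consumes the data, then raises ACK.
    received = data_wires
    ack = 1
    events.append(("ACK", ack))

    # Phase 3: the sender may now remove the data and returns REQ low.
    data_wires = None
    req = 0
    events.append(("REQ", req))

    # Phase 4: the receiver returns ACK low, signalling readiness for more data.
    ack = 0
    events.append(("ACK", ack))

    return received, events
```

Each transfer thus produces four control transitions in a fixed order, with both wires back in their initial state at the end.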
A first-in first-out (FIFO) register or pipeline provides an example of self-timed systems that couple together a number of functional units. FIG. 3 shows such a self-timed FIFO structure. The functional units can be registers 300a-300c with both an input interface protocol and an output interface protocol. When empty, each of the registers 300a-300c can receive data via an input interface 302 for storage. Once the data is stored in one of the registers 300a-300c, the input interface of that register cannot accept more data. In this condition, input for that register has "stalled". Each of the registers 300a-300c remains stalled until that register is again empty. However, once the register 300a contains data, the register 300a can pass the data to the next stage (i.e., register) of the self-timed FIFO structure via an output interface 304. The register 300a generates an output request when the data to be output is valid. Once the data has been consumed and the data is no longer required, the register 300a is then in the empty state. Accordingly, the register 300a can again receive data using the input interface protocol.
Chaining the registers 300a-300c together by coupling the output interface 304 to the input interface 302 forms a multiple stage FIFO or pipeline. Thus, output interface request and acknowledge signals, Rout and Aout, are respectively coupled to the input interface request and acknowledge signals, Rin and Ain, of the following register 300a-300c (stage). As shown in FIG. 3, data signals Din, Dout passed into a FIFO input 306 will be passed from register 300a to register 300c to eventually emerge at a FIFO output 308. Thus, data ordering is preserved as the data is sequentially passed along the FIFO. The FIFO structure shown in FIG. 3 can use the 4-phase control communication scheme shown in FIG. 2 as the input and output interface protocol.
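The empty and stalled behavior of the chained registers can be sketched in software as follows. This is an illustrative model only: the names `Stage`, `step`, `fifo_push` and `fifo_pop` are assumptions, and the discrete `step` function stands in for handshakes that, in a self-timed circuit, would occur without any global step.

```python
class Stage:
    """One register stage: either empty (data is None) or holding one value."""
    def __init__(self):
        self.data = None

    def is_empty(self):
        return self.data is None


def step(stages):
    """Pass each stored value to the next stage if that stage is empty."""
    # Walk from the output end back to the input end so that a value
    # advances by at most one stage per call.
    for i in range(len(stages) - 2, -1, -1):
        if not stages[i].is_empty() and stages[i + 1].is_empty():
            stages[i + 1].data = stages[i].data
            stages[i].data = None


def fifo_push(stages, value):
    """Offer a value at the FIFO input; return False if the input has stalled."""
    if not stages[0].is_empty():
        return False
    stages[0].data = value
    return True


def fifo_pop(stages):
    """Consume the value at the FIFO output (None if nothing has arrived yet)."""
    value = stages[-1].data
    stages[-1].data = None
    return value
```

For example, with three stages, pushing "a", stepping, pushing "b" and stepping twice leaves "a" at the output and "b" one stage behind it, so the values emerge in the order in which they were entered.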
Related art self-timed circuits can be grouped according to gate propagation delays and wiring delays. Such groupings or classifications include bounded delay, delay insensitive, and speed independent circuits.
In a bounded delay model, gate delays and wire delays are assumed finite and quantifiable. The bounded delay model can be used in synchronous systems. Data from the sender in the bounded delay model can be transferred to the receiver after an appropriate delay.
In a delay-insensitive model, gate delays and wire delays are assumed to be unbounded. Data from the sender must be consumed and acknowledged by the receiver before new data can be transferred.
In a speed-independent delay model, gate delays are assumed to be unbounded but wire delays are assumed to be negligible. Control circuits using the speed-independent delay model can be less complex because data sent by the sender to a plurality of receivers need only be acknowledged by a single receiver of the plurality of receivers.
A stack operates as a last in first out (LIFO) device. For example, a stack can be used as a memory device. The last value input to the stack in a write operation is the first value to be output during the next read operation. In a VLSI system, for example, stacks are used in various ways such as in evaluating expressions, managing return addresses for subroutine calls, storage of local variables for subroutines, and passing parameters for subroutine calls.
Examples of related art stacks include a register based stack, a latch based stack and a RAM based stack. In the register based stack, data values are stored in a shift register. Placing data in the stack is called a "push" and requires new data to be shifted into the shift register in one direction. Removing the data from the stack is called a "pop" and shifts the data in the opposite direction out of the shift register. Each "push" and "pop" operation requires every data value stored in the shift register to move one location. Thus, register based stacks can operate as a two-way shift register. Shift registers have a simple control structure. However, since every data value moves with each push or pop operation, a control signal must be broadcast to all storage elements in the register. Accordingly, complex D type latches are used. Further, power inefficiencies occur as data propagates the entire length of the shift register for each of the push or pop operations.
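The cost of the register based stack can be made concrete with a small software model. The class name `ShiftRegisterStack` and the `moves` counter are assumptions introduced for illustration; the model counts how many stored locations are shifted per operation and does not represent the latch circuitry itself.

```python
class ShiftRegisterStack:
    """Fixed-depth two-way shift register used as a stack (top is cell 0)."""
    def __init__(self, depth):
        self.cells = [None] * depth
        self.moves = 0  # total cell-to-cell shifts performed so far

    def push(self, value):
        # Every location shifts one place away from the top, occupied or not.
        for i in range(len(self.cells) - 1, 0, -1):
            self.cells[i] = self.cells[i - 1]
            self.moves += 1
        self.cells[0] = value

    def pop(self):
        value = self.cells[0]
        # Every location shifts one place back toward the top.
        for i in range(len(self.cells) - 1):
            self.cells[i] = self.cells[i + 1]
            self.moves += 1
        self.cells[-1] = None
        return value
```

In this model a stack of depth 4 performs three shifts for every push or pop, regardless of how many values it actually holds, which corresponds to the power inefficiency noted above.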
In the latch based stack, data values are stored in a simple latch rather than complex D type latches. Unlike the register based stack design, communication between adjacent latches is local in the latch based stack. Each latch communicates with adjacent neighbors to transfer new data into the stack (push) or read the data from the stack (pop). Latch based stacks can operate as a two way ripple register. The latch based stack has advantages when the stack is not full or empty. In other words, pushing the data onto an empty stack results in the transfer of data using only a single latch. Thus, an advantage of the latch based stack is a reduction in the number of data transfers between latches when the stack is not full. However, disadvantages of the latch based stack include a variable response time based on how many data items are in the stack during a given operation. The number of data transfers between adjacent latches increases as the latch based stack fills. Preferably, a response time is constant for push or pop operations regardless of the amount of data in the stack.
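The variable response time of the latch based stack can be sketched in the same way. The following model is an assumption about one possible ripple scheme, not a description of any particular related art circuit: only occupied latches pass data to their neighbors, and the hypothetical `RippleStack` class reports the number of latch-to-latch transfers for each operation.

```python
class RippleStack:
    """Two-way ripple register used as a stack (top is latch 0)."""
    def __init__(self, depth):
        self.latches = [None] * depth
        self.count = 0  # number of occupied latches

    def push(self, value):
        """Store a value; return the number of latch-to-latch transfers."""
        if self.count == len(self.latches):
            raise OverflowError("stack full")
        transfers = 0
        # Only occupied latches hand their data to the next neighbor.
        for i in range(self.count, 0, -1):
            self.latches[i] = self.latches[i - 1]
            transfers += 1
        self.latches[0] = value
        self.count += 1
        return transfers

    def pop(self):
        """Remove the top value; return (value, latch-to-latch transfers)."""
        if self.count == 0:
            raise IndexError("stack empty")
        value = self.latches[0]
        transfers = 0
        for i in range(self.count - 1):
            self.latches[i] = self.latches[i + 1]
            transfers += 1
        self.latches[self.count - 1] = None
        self.count -= 1
        return value, transfers
```

In this sketch a push onto an empty stack involves only the first latch, while each further push requires one more transfer than the previous one, so the response time grows with the amount of stored data.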
In the RAM based stack, the address space of the RAM is considered to be linear. Thus, a stack can be implemented by adding a pointer to indicate the correct address for the next write (push) or read (pop) operation. For example, an address pointer can be incremented or decremented for each push or pop instruction, respectively. Thus, the RAM based stack can be implemented as an array with a pointer. The RAM based stack is efficient for large stacks. However, in the RAM based stack, management of the full and empty conditions is complex.
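A minimal software sketch of the pointer scheme is given below. The class name `RamStack` and the convention that the pointer holds the address of the next free location are assumptions chosen for illustration; other pointer conventions are equally possible.

```python
class RamStack:
    """Stack held in a linear RAM array and addressed by a single pointer."""
    def __init__(self, size):
        self.ram = [0] * size
        self.sp = 0  # assumed convention: address of the next free location

    def is_empty(self):
        return self.sp == 0

    def is_full(self):
        return self.sp == len(self.ram)

    def push(self, value):
        if self.is_full():
            raise OverflowError("stack full")
        self.ram[self.sp] = value  # write at the pointer ...
        self.sp += 1               # ... then increment it

    def pop(self):
        if self.is_empty():
            raise IndexError("stack empty")
        self.sp -= 1               # decrement the pointer ...
        return self.ram[self.sp]   # ... then read at the new address
```

No stored value moves on a push or pop in this model; only the pointer changes. The explicit `is_full` and `is_empty` checks indicate where the full and empty conditions have to be managed.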
Synchronous implementations for stacks trade off performance for power consumption or ease of implementation. In addition, the time required to perform a push or pop operation depends on the number of data values stored in the stack. A need exists for a stack that exhibits low power and a constant response time.