First-In-First-Out (FIFO) memory queues are useful in numerous applications. They allow the storage and retrieval of data upon user request using a standard read/write interface, as well as crossing between clock domains (in the case of an asynchronous FIFO). FIFOs are typically constructed out of dedicated memory, such as blockRAM (BRAM), in an integrated circuit. However, blockRAM clock-to-out times are typically on the order of 2.0 ns or larger, even in the fastest speed grades of the newest architectures. This clock-to-out time is at least four times greater than the clock-to-out time of a register (also commonly called a flip-flop), which is approximately 0.5 ns.
In high-speed designs, the slow clock-to-out time of the blockRAM can create a critical path within the design. Therefore, it is common practice to register the output of the blockRAM immediately, prior to performing any operation on the data. While this approach is beneficial, it has a significant limitation: registering the data prior to using it adds a cycle of latency to the FIFO.
A block diagram for a conventional FIFO 100 is shown in FIG. 1. In particular, a blockRAM 102 is coupled to write logic 104 and read logic 106. The blockRAM receives a read enable (RdEn) signal and outputs data to a register 108. The read logic 106 outputs a RdDataValid signal to a register 110, which outputs a registered value RdDataValid_r. However, a single read request in the conventional circuit of FIG. 1 takes two clock cycles before the data is available to the user. As shown in FIG. 2, RdEn requests a read from the FIFO, and RdValid indicates that new data (RdData) is available on the data bus on the following cycle. RdValid_r and RdData_r, which are the registered versions of RdValid and RdData, are available to the user two clock cycles following the read request.
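The two-cycle read latency described above can be reproduced with a small cycle-level model. This is only an illustrative sketch: the class and function names are hypothetical, and it assumes the blockRAM output updates on the clock edge that samples RdEn and that registers 108 and 110 capture it on the next edge.

```python
from collections import deque


class ConventionalFifoReadPath:
    """Cycle-level sketch of the FIG. 1 read path (hypothetical model)."""

    def __init__(self, contents):
        self.mem = deque(contents)  # stands in for the blockRAM 102 contents
        self.rd_data = None         # blockRAM output (RdData)
        self.rd_valid = False       # valid flag from the read logic (RdValid)
        self.rd_data_r = None       # output of register 108 (RdData_r)
        self.rd_valid_r = False     # output of register 110 (RdValid_r)

    def clock(self, rd_en):
        """Advance one rising clock edge; rd_en is sampled at that edge."""
        # The output registers capture what the blockRAM stage held before the edge.
        self.rd_data_r = self.rd_data
        self.rd_valid_r = self.rd_valid
        # Synchronous read: the blockRAM output changes only at a clock edge.
        if rd_en and self.mem:
            self.rd_data = self.mem.popleft()
            self.rd_valid = True
        else:
            self.rd_valid = False


def read_latency(fifo, max_edges=10):
    """Count clock edges from a single RdEn pulse until RdValid_r is asserted."""
    fifo.clock(rd_en=True)  # edge 1 samples the read request
    edges = 1
    while not fifo.rd_valid_r:
        if edges >= max_edges:
            raise RuntimeError("no registered data appeared (FIFO empty?)")
        fifo.clock(rd_en=False)
        edges += 1
    return edges


fifo = ConventionalFifoReadPath([0xA, 0xB])
latency = read_latency(fifo)
print(latency, hex(fifo.rd_data_r))
```

In this model the first edge moves the word to the blockRAM output and the second edge moves it into the output register, which matches the timing described for FIG. 2.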
This two-cycle latency in reading data is undesirable for a number of reasons. The read enable to the FIFO is most likely a critical path in the design. In addition to the clock-to-out time, the setup time to the blockRAM is also large when compared to normal register setup times. BlockRAM setup time for read enable is approximately 1 ns, while register setup time for read enable is approximately 0.2 ns. If the next read request is dependent upon the data read out of the FIFO, the large clock-to-out and setup requirements severely limit this type of read request. Although one common solution is to always enable the blockRAM for reading, and control the address pointers into the FIFO, there is still a two-cycle latency from read request to registered and valid data.
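The effect of these figures on a data-dependent read can be estimated with simple arithmetic. The sketch below uses only the approximate values quoted in the text and assumes zero logic and routing delay between the launching and capturing elements, so it gives an optimistic lower bound on the clock period rather than a real timing result. The function and constant names are illustrative.

```python
# Approximate figures quoted in the text; actual values depend on the
# device family and speed grade.
BRAM_CLK_TO_OUT_NS = 2.0
BRAM_RDEN_SETUP_NS = 1.0
REG_CLK_TO_OUT_NS = 0.5
REG_RDEN_SETUP_NS = 0.2


def min_period_ns(clk_to_out_ns, setup_ns, logic_and_routing_ns=0.0):
    """Lower bound on the clock period for one launch-to-capture path."""
    return clk_to_out_ns + logic_and_routing_ns + setup_ns


def max_frequency_mhz(period_ns):
    """Convert a minimum period in ns to a maximum clock frequency in MHz."""
    return 1000.0 / period_ns


# Path 1: blockRAM data output feeds back to the blockRAM read enable.
bram_period = min_period_ns(BRAM_CLK_TO_OUT_NS, BRAM_RDEN_SETUP_NS)
# Path 2: the same loop built from ordinary registers.
reg_period = min_period_ns(REG_CLK_TO_OUT_NS, REG_RDEN_SETUP_NS)

print(f"blockRAM loop: {bram_period:.1f} ns, {max_frequency_mhz(bram_period):.0f} MHz")
print(f"register loop: {reg_period:.1f} ns, {max_frequency_mhz(reg_period):.0f} MHz")
```

Under these assumptions the blockRAM-to-blockRAM loop needs about 3.0 ns before any logic is added, while the register-to-register loop needs about 0.7 ns, which is why a read request that depends on the previous read data is so constrained.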
Zero-cycle latency FIFO memory queues are useful in numerous applications. They allow the storage and retrieval of data in the same cycle in which it is requested using a standard push/pop interface, as well as crossing between clock domains. Additionally, they provide same-cycle turn-around when requesting data, allowing for efficient read throttling based on the contents of the data (i.e., read a data word, investigate the contents, and decide to read again or not at all in a single clock cycle).
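The read throttling described above can be illustrated with a purely behavioral sketch. It is a software abstraction, not a hardware implementation: the class, the `read_packet` helper, and the end-of-packet marker value are hypothetical, and each loop iteration stands for one clock cycle in which a word is popped, inspected, and used to decide whether to pop again.

```python
from collections import deque


class ZeroLatencyFifo:
    """Behavioral push/pop queue whose popped word is usable immediately."""

    def __init__(self):
        self._words = deque()

    def push(self, word):
        self._words.append(word)

    def pop(self):
        return self._words.popleft()

    def __len__(self):
        return len(self._words)


def read_packet(fifo, is_last):
    """Pop words until is_last(word) is true; return (words, cycles used)."""
    words = []
    cycles = 0
    while len(fifo) > 0:
        cycles += 1            # one iteration models one clock cycle
        word = fifo.pop()      # data is available in the cycle it is requested
        words.append(word)
        if is_last(word):      # the contents throttle the next read
            break
    return words, cycles


END_MARKER = 0xFF  # hypothetical end-of-packet word
fifo = ZeroLatencyFifo()
for w in (0x01, 0x02, END_MARKER, 0x03):
    fifo.push(w)

packet, cycles = read_packet(fifo, lambda w: w == END_MARKER)
print(packet, cycles, len(fifo))
```

Because the decision to stop is made in the same cycle the marker is read, the word that follows the marker stays in the queue and no cycle is spent waiting for data.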
Traditionally, zero-cycle latency FIFOs are implemented in asynchronous read RAMs, which can directly provide the data on the same cycle it is requested. However, fully synchronous devices do not support this type of operation. Additionally, even if a design is created which allows the blockRAM to be used in a zero-cycle latency manner, the clock-to-out times of the blockRAM are relatively slow (on the order of 2.0 ns or larger) compared to register clock-to-out times (approximately 0.5 ns). The performance of a zero-cycle latency FIFO driven directly from blockRAMs can therefore be limited by the blockRAM clock-to-out times. A common solution to this problem is to register the output in registers directly following the blockRAM, but this changes the zero-cycle latency FIFO to a one-cycle latency FIFO. This is not desirable if the current data is used to throttle a read for the next cycle.
Accordingly, there is a need for an integrated circuit and method of reading data from a memory device which reduces the clock-to-out time and setup time of a blockRAM.