1. Field of the Invention
This invention relates to buffer storage systems and more particularly to pipelined buffer storage systems and methods of operation thereof.
2. Description of Related Art
In many networked applications, it is necessary to buffer large amounts of data in a pipeline buffer. A pipelined buffer includes “N” hardware stages, where N is a positive integer, organized as a series of memory registers connected in series for temporarily storing a sequence (line) of data bits passed from one register to the next, as in a bucket brigade. That is to say, data bits are moved serially from one memory register to the next register in the pipeline, one by one, with each register being replenished by the next data bit in line. Since the “N” stages operate substantially concurrently, a pipeline can operate faster than a non-pipelined system. A pipeline buffer is a First-In-First-Out (FIFO) buffer in which one “word” of data is written into the buffer and one “word” of data is read from the buffer on each clock cycle. The number of “words” stored in the buffer is a fixed value equal to the depth of the pipeline. For example, if the depth of the pipeline is N, the word written into the buffer at cycle “i” is read from the buffer N cycles later, at cycle “i+N”; conversely, the word read from the buffer at cycle “i” was written into the buffer at cycle “i−N”.
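For purposes of illustration only, the fixed-depth FIFO behavior described above can be modeled in software. The following Python sketch is a behavioral model, not part of any circuit described herein; the names `PipelineBuffer`, `depth`, and `fill` are illustrative assumptions:

```python
from collections import deque

class PipelineBuffer:
    """Behavioral model of an N-deep pipeline (FIFO) buffer.

    One word is written and one word is read on every clock cycle;
    the buffer always holds exactly `depth` words, so the word
    written at cycle i emerges at cycle i + N.
    """
    def __init__(self, depth, fill=0):
        # Initialize all N stages to a fill value (power-on contents).
        self.stages = deque([fill] * depth, maxlen=depth)

    def clock(self, data_in):
        """Advance one cycle: read the oldest word, write the new one."""
        data_out = self.stages[0]
        # With maxlen set, append drops the oldest word automatically.
        self.stages.append(data_in)
        return data_out
```

With a depth of 3, the first three reads return the fill value, after which each read returns the word written three cycles earlier.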
Referring to FIG. 1, a schematic circuit diagram is shown of a common, prior art pipeline buffer 7, which includes a set of N multi-bit registers REG1 10, REG2 11, REG3 12, . . . REGN−1 13, and REGN 14, where N is a positive integer equal to the number of registers therein. Input data DATAIN[i] is submitted to the pipeline buffer 7 on input bus lines 8. The pipeline buffer 7 is configured as sets of multi-bit shift registers, where data is shifted from one of the sets of multi-bit registers 10-14 to the next set of multi-bit registers on each clock cycle, as shown in FIG. 1.
Each bit of each of the multi-bit registers 10-14 of FIG. 1 is typically implemented as a flip-flop. Since two latches are required to form each flip-flop, a substantial chip area is required for each flip-flop. The same clock signal CL1 on line 9 clocks all of the multi-bit registers 10-14 in the pipeline buffer 7. Output data (DATAOUT[i]) is delivered in parallel from the last multi-bit register 14 on bus lines 15. It is noted that the data out from pipeline buffer 7 on bus lines 15 is defined by the relationship DATAOUT[i]=DATAIN[i−N].
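The register-chain (bucket-brigade) structure of FIG. 1 can be sketched as follows; this is a behavioral model offered for illustration, in which each list element stands in for one multi-bit register REG1 . . . REGN, and the function name is an illustrative assumption:

```python
def clock_shift_chain(regs, data_in):
    """One clock edge of the FIG. 1 style shift-register pipeline.

    regs[0] models REG1 (nearest DATAIN), regs[-1] models REGN;
    the value leaving regs[-1] is DATAIN delayed by N cycles.
    """
    data_out = regs[-1]
    # Every register loads its predecessor's old value on the clock edge.
    regs[1:] = regs[:-1]
    regs[0] = data_in
    return data_out
```

Because every stage is clocked by the same signal CL1, all assignments in a real circuit occur on the same edge; the model shifts the whole list at once to reflect that.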
There are two problems with the typical implementation of the pipeline buffer array 7 shown in FIG. 1, as follows:
1. Clock Skew Problems
First, in an Application Specific Integrated Circuit (ASIC) design environment, where flip-flop cells (e.g. registers 10-14) are automatically placed and routed, it is difficult to manage clock skew to avoid fast path (early mode) failures without adding a significant amount of delay into each register-to-register path.
2. Delays Caused by Excessive Chip Area Requirements
Secondly, the chip area required by such a pipeline buffer array 7 can become quite large, and the delay elements added to manage clock skew contribute further to that area.
FIG. 2 shows a schematic circuit diagram of an alternative prior art FIFO (First In First Out) buffer configuration 17 which utilizes a two-port memory array 23, consisting of a write port 24 for writing data to a selected write address and a read port 25 for reading data from a selected read address. A FIFO buffer typically also includes address counters and address comparison logic to detect when there is data in the FIFO buffer (the read and write addresses are not equal) versus when the FIFO buffer is “empty” (the read and write addresses are equal).
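The address-counter and address-comparison scheme of a general FIFO buffer can be sketched as follows. This is an illustrative behavioral model, not the circuit of FIG. 2; the class name `CircularFifo` and parameter `m` are assumptions, and full-versus-empty disambiguation is omitted for brevity:

```python
class CircularFifo:
    """General FIFO built on an M-word array with two address counters."""
    def __init__(self, m):
        self.mem = [None] * m   # M-word memory array
        self.m = m
        self.write_addr = 0     # write address counter
        self.read_addr = 0      # read address counter

    def empty(self):
        # The FIFO is "empty" exactly when the two addresses are equal.
        return self.read_addr == self.write_addr

    def write(self, word):
        self.mem[self.write_addr] = word
        self.write_addr = (self.write_addr + 1) % self.m

    def read(self):
        word = self.mem[self.read_addr]
        self.read_addr = (self.read_addr + 1) % self.m
        return word
```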
A clock pulse 19 is applied to a register 20, which supplies an output on line 21 to node 21′; the value at node 21′ is incremented by one (+1) in incrementer 22 and returned to register 20. The value on node 21′ from register 20 passes through line 21 to the write address input of the memory array 23 and, via the −N subtractor 25 and line 25′, to the read address input of memory array 23. The input data (DATAIN[i]) is submitted on bus lines 18 to the data input of memory array 23, and the data out (DATAOUT[i]) from the two-port memory array 23 is delivered on output bus lines 23′.
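The address relationship implemented by the counter and −N subtractor described above can be sketched as follows; this is an illustrative model, and the function name `fifo_addresses` and parameter names are assumptions:

```python
def fifo_addresses(cycle, depth, mem_words):
    """Write/read addresses for the FIG. 2 style pipeline FIFO.

    The write address is a free-running counter (register 20 plus
    incrementer 22); the read address trails it by the pipeline
    depth N (the -N subtractor), modulo the array size M.
    """
    write_addr = cycle % mem_words
    read_addr = (write_addr - depth) % mem_words
    return write_addr, read_addr
```

Note that Python's `%` operator returns a non-negative result for a negative left operand, so the subtraction wraps correctly without an explicit adjustment.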
FIG. 2 shows a FIFO buffer that can easily be tailored to implement a pipeline buffer array. In the case of FIG. 2, however, a problem remains to be solved: the read and write addresses are always at a fixed difference, N, from each other (modulo M, where M is the number of words in the array).
The memory array implementation of FIG. 2 would solve the first problem, i.e. the clock skew problem, because of its regular, predetermined and pre-characterized layout. However, because of the overhead of address decoding logic and testability, the memory array implementation is larger than the flip-flop implementation unless the number of stages, N, in the pipeline is large (≧16); for shallower pipelines, excessive area on the chip is required.
Because typical applications require fewer than a dozen pipeline stages, an alternative implementation is required.