1. Field of the Invention
The invention relates to the field of digital communications, and more particularly, to an elastic buffer compatible with various communications protocols.
2. Related Art
In a serial transceiver (transmitter-receiver), separate clocks control the rates of data being received by and read from the receiver. Although nominally running at the same frequency, the recovered clock (derived from the received data) and the read clock typically differ in frequency by up to 200 ppm (parts per million). To accommodate this asynchronous behavior, the receiver in a serial transceiver frequently includes an “elastic buffer.” An elastic buffer is a modified FIFO (first in first out) memory that compensates for frequency discrepancies between the recovered clock and the read clock in a serial transceiver. During normal operation, data is continuously written to and read from the elastic buffer by the recovered clock (i.e., the write clock of the elastic buffer) and read clock, respectively.
FIG. 1a shows a conventional elastic buffer 100, which comprises a controller 110 and a memory array 120. Memory array 120 includes a plurality of equally sized memory locations A00–A63, each holding one “block” of data (i.e., the smallest grouping of data bits in a data stream—a byte, for example). Controller 110 receives an input data stream Din, a write (recovered) clock signal Wclk, and a read clock signal Rclk. Input data stream Din is formed by a series of data blocks, wherein the width of each data block is dependent on the particular communications protocol being used. When write clock signal Wclk is driven HIGH, controller 110 generates a write address Waddr corresponding to one of memory locations A00–A63, and the first available data block of input data stream Din is written into that memory location. The next time write clock signal Wclk is driven HIGH, controller 110 increments write address Waddr, so that consecutive data blocks from input data stream Din are stored in contiguous (i.e., sequential) memory locations in memory array 120. Note that in a FIFO such as elastic buffer 100, the last memory location (A63) in memory array 120 is treated as being contiguous with the first memory location (A00). Therefore, after a data block is written to memory location A63, the next write operation is to memory location A00. Note that while a memory location is contiguous with the memory locations that immediately precede and immediately follow it, the memory location is not contiguous with itself.
In a similar manner, during read operations to memory array 120, a positive edge on read clock signal Rclk causes controller 110 to generate a read address Raddr that corresponds to one of memory locations A00–A63, and the data block stored in that memory location is read out. During normal read operations, each positive edge on read clock signal Rclk increments read address Raddr so that the contents of contiguous memory locations are read as an output data stream Dout. As noted previously, memory location A63 is treated as being contiguous with memory location A00.
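The wraparound addressing described above can be illustrated with a minimal Python sketch. The class name `ElasticBufferModel` and its methods are hypothetical names chosen for illustration. The model ignores the separate clock domains and simply treats each method call as one positive clock edge.

```python
class ElasticBufferModel:
    """Toy model of a single-block-wide elastic buffer (names are illustrative)."""

    def __init__(self, depth=64):
        self.depth = depth
        self.mem = [None] * depth   # memory locations A00..A63
        self.waddr = 0              # write address Waddr
        self.raddr = 0              # read address Raddr

    def write(self, block):
        """One positive edge on Wclk: store a block, then advance Waddr."""
        self.mem[self.waddr] = block
        self.waddr = (self.waddr + 1) % self.depth  # A63 wraps to A00

    def read(self):
        """One positive edge on Rclk: fetch a block, then advance Raddr."""
        block = self.mem[self.raddr]
        self.raddr = (self.raddr + 1) % self.depth  # A63 wraps to A00
        return block


buf = ElasticBufferModel(depth=64)
for i in range(64):
    buf.write(f"d{i}")          # fills A00..A63; Waddr wraps back to A00
first_three = [buf.read() for _ in range(3)]
```

After the 64 writes, the write address has wrapped back to the first location, which is the sense in which A63 is treated as contiguous with A00.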
A “half full” configuration (i.e., half of memory locations A00–A63 are storing buffered data not yet read) gives elastic buffer 100 the greatest cushion for variations in write and read clock rates. However, because write clock signal Wclk and read clock signal Rclk typically run at different frequencies, this half full configuration cannot be sustained indefinitely. For example, if write clock signal Wclk is faster than read clock signal Rclk, memory array 120 will “fill up” with unread data blocks. Eventually, new data blocks from input data stream Din will overwrite already-stored data blocks in memory array 120 before those older data blocks can be read out (overflow state). If write clock signal Wclk is slower than read clock signal Rclk, memory array 120 will be “emptied” as read operations take place faster than new data blocks can be stored. In such a case, a data block in a particular memory location may be read out multiple times before a new data block is stored in that memory location (underflow state).
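As a rough illustration of how long the half-full cushion lasts, the sketch below estimates how many read cycles elapse before an uncorrected buffer reaches overflow or underflow. It assumes a constant frequency offset between the two clocks and reuses the 64-location and 200 ppm figures from the text. The function name is hypothetical.

```python
def cycles_until_limit(depth, offset_ppm):
    """Read-clock cycles before a half-full buffer overflows or underflows.

    Assumes a constant offset between Wclk and Rclk, so the fill level
    drifts by offset_ppm / 1e6 blocks on every read cycle.
    """
    cushion = depth / 2                       # blocks of margin in either direction
    return cushion * 1_000_000 / offset_ppm   # cushion divided by drift per cycle


cycles = cycles_until_limit(depth=64, offset_ppm=200)
```

Under these assumptions the cushion of 32 blocks is consumed after 160,000 read cycles, which is why some form of correction is needed during continuous operation.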
To compensate for these read and write clock frequency discrepancies, elastic buffer 100 can execute “clock correction” operations (sometimes referred to as “rate matching” operations), in which special data blocks originally included in input data stream Din are omitted from output data stream Dout, or else special data blocks not originally present in input data stream Din are added to output data stream Dout. A “correction sequence” can be defined as the smallest set of data blocks, or “correction blocks,” that may be omitted or added for clock correction operations. The correction (omission or addition) takes place at a location within the data stream where the correction sequence is (or was) present. The particular communications protocol being used defines the length (number of blocks, greater than or equal to one) of the correction sequence, as well as the values of the blocks in the correction sequence. Correction sequences are typically present at various locations within input data stream Din. The associated communications protocol is designed to ignore the presence or absence of such correction sequences during processing of a data stream. Therefore, controller 110 can monitor input data stream Din for these correction sequences and use them to execute clock correction operations when memory array 120 approaches an overflow or underflow state, without affecting the information being carried by data stream Din (or Dout).
As an example, assume that input data stream Din is made up of consecutive data blocks d0, d1, X, d3, d4, d5, d6, X, d8, etc., where X is a correction sequence consisting of a single correction block. At each positive edge on write clock signal Wclk, a data block is written into memory array 120; e.g., data block d0 is written into memory location A00, data block d1 is written into memory location A01, correction block X is written into memory location A02, and so forth. Then, if memory array 120 were getting too full (i.e., if an overflow state were being approached), controller 110 could generate a read address Raddr that jumped from memory location A01 to memory location A03, thereby skipping the readout of correction block X. This type of “accelerating clock correction” effectively speeds up the reading of actual data to compensate for the slower read clock signal Rclk. Similarly, if memory array 120 were getting too empty (i.e., if an underflow state were being approached), controller 110 could simply repeat (i.e., not increment) read address Raddr when a memory location holding a correction block was reached. This type of “delaying clock correction” effectively slows down the faster read operations until sufficient new data blocks can be written into memory array 120.
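The two kinds of clock correction can be sketched on the read side as follows. This is a simplified model, not the controller's actual logic: the function name `read_stream` and its `mode` parameter are hypothetical, and the sketch applies at most one correction per call rather than deciding from the buffer fill level.

```python
CORRECTION = "X"  # single-block correction sequence (value is illustrative)


def read_stream(mem, n_reads, mode=None):
    """Read n_reads blocks from mem, applying at most one clock correction.

    mode="skip"   : jump over the first correction block (accelerating).
    mode="repeat" : hold Raddr once on the first correction block (delaying).
    """
    out, raddr, corrected = [], 0, False
    for _ in range(n_reads):
        if mode == "skip" and not corrected and mem[raddr] == CORRECTION:
            raddr = (raddr + 1) % len(mem)   # e.g. go from A01 to A03, not A02
            corrected = True
        out.append(mem[raddr])
        if mode == "repeat" and not corrected and mem[raddr] == CORRECTION:
            corrected = True                 # Raddr is not incremented this cycle
        else:
            raddr = (raddr + 1) % len(mem)
    return out


mem = ["d0", "d1", "X", "d3", "d4", "d5", "d6", "X", "d8"]
normal = read_stream(mem, 5)                    # d0 d1 X d3 d4
skipped = read_stream(mem, 5, mode="skip")      # d0 d1 d3 d4 d5
repeated = read_stream(mem, 5, mode="repeat")   # d0 d1 X X d3
```

In the same five read cycles, skipping delivers one more actual data block than normal reading, and repeating delivers one fewer. The information carried by the stream is unchanged because only correction blocks are dropped or duplicated.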
Commonly, multiple transceivers, each with its own elastic buffer, may be operating in parallel to increase overall data throughput. The input data stream is broken into discrete data blocks and those data blocks pass through the parallel transceivers as an aggregate data stream. For example, FIG. 1b shows elastic buffers 100a and 100b in two transceivers configured to act as two “channels” for an input data stream Din. Elastic buffers 100a and 100b are substantially similar to elastic buffer 100 shown in FIG. 1a. Input data stream Din is split into two partial data streams Din_a and Din_b by feeding alternating data blocks from input data stream Din into each partial data stream. On every positive edge on write clock signal Wclk_a, a data block from partial data stream Din_a is sent to elastic buffer 100a and stored in memory array 120a at write address Waddr_a. Similarly, on each positive edge on write clock signal Wclk_b (which runs at the same frequency as write clock signal Wclk_a), a block from partial data stream Din_b is sent to elastic buffer 100b and is stored in memory array 120b at write address Waddr_b. Therefore, on each write clock cycle, two sequential data blocks from input data stream Din are stored.
Similarly, on every positive edge on read clock signals Rclk_a and Rclk_b (which are equal), two data blocks are read—one from read address Raddr_a in memory array 120a (as part of partial data stream Dout_a) and one from read address Raddr_b in memory array 120b (as part of partial data stream Dout_b). These two data blocks are reassembled into sequential data blocks in output data stream Dout. The use of two transceivers in parallel in this way doubles the data throughput compared with a single transceiver. For example, assume that input data stream Din is made up of consecutive data blocks da0, db0, da1, db1, da2, db2, da3, db3, etc. Ideally, the data stored in memory arrays 120a and 120b would be arranged as shown in Table 1.
TABLE 1
ALIGNED MULTI-CHANNEL DATA

120a        120b
A00: da0    B00: db0
A01: da1    B01: db1
A02: da2    B02: db2
A03: da3    B03: db3
...         ...
Each read operation would read out the appropriate pair of data blocks—i.e., (da0 db0), (da1 db1), (da2 db2), (da3 db3), etc. Output data stream Dout would then be properly reassembled. Unfortunately, unequal transmission delays on the two channels and other effects can cause the data blocks stored in memory arrays 120a and 120b to be skewed significantly with respect to each other; i.e., the data blocks stored in memory array 120a may be offset (misaligned) from their corresponding data blocks stored in memory array 120b, as shown, for example, in Table 2.
TABLE 2
MISALIGNED MULTI-CHANNEL DATA

120a        120b
A00: da0    B00: db1
A01: da1    B01: db2
A02: da2    B02: db3
A03: da3    B03: db4
...         ...
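The effect of the skew shown in Tables 1 and 2 can be reproduced with a short sketch. The helper names `split_channels` and `reassemble` are hypothetical, and the one-block offset on channel b is simply imposed by slicing rather than modeled as a transmission delay.

```python
def split_channels(din):
    """Feed alternating blocks of Din into partial streams Din_a and Din_b."""
    return din[0::2], din[1::2]


def reassemble(mem_a, mem_b):
    """Read one block per channel per read cycle and interleave them into Dout."""
    dout = []
    for a, b in zip(mem_a, mem_b):
        dout.extend([a, b])
    return dout


din = ["da0", "db0", "da1", "db1", "da2", "db2", "da3", "db3", "da4", "db4"]
din_a, din_b = split_channels(din)

aligned = reassemble(din_a, din_b)           # Table 1 arrangement
misaligned = reassemble(din_a, din_b[1:])    # Table 2: channel b offset by one block
```

With the aligned arrangement the reassembled stream equals the input. With the offset arrangement every read pairs a block from one channel with the wrong block from the other.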
In this case, each read operation would read out mismatched pairs of data blocks—i.e., (da0 db1), (da1 db2), (da2 db3), (da3 db4), etc. Output data stream Dout would then no longer be an accurate recreation of input data stream Din. To provide a means for correcting this problem, partial input data streams Din_a and Din_b will typically include special “alignment sequences” consisting of one or more “alignment blocks” that define corresponding points in the separate channels. As with the aforementioned correction sequences, the length of an alignment sequence and the values of the alignment blocks are defined by the particular communications protocol being used. These predefined alignment sequences can therefore be used to correlate the data blocks stored in different elastic buffers. For clarity, the invention will be discussed with respect to alignment sequences consisting of a single data block. Note, however, that the same principles apply to alignment sequences including multiple data blocks, as the position of the leading data block in such a multi-block alignment sequence would determine the manner in which an associated alignment operation would be carried out.
For example, suppose that input data stream Din is made up of data blocks da0, db0, DA, DB, da1, db1, da2, db2, etc., where data blocks DA and DB represent alignment blocks. Table 3 shows a possible data misalignment that could result from such an input data stream.
TABLE 3
MISALIGNED MULTI-CHANNEL DATA WITH ALIGNMENT DATA BLOCKS

120a        120b
A00: da0    B00: DB
A01: DA     B01: db1
A02: da1    B02: db2
A03: da2    B03: db3
...         ...
As with the misaligned data shown in Table 2, simply reading out the stored data depicted in Table 3 would result in a corrupted output data stream Dout. However, during normal operation of elastic buffers 100a and 100b, controllers 110a and 110b monitor partial input data streams Din_a and Din_b, respectively, and record the memory locations in which alignment blocks DA and DB, respectively, are stored. Alignment blocks DA and DB can then be used to perform a “channel bonding” (or “channel alignment”) operation to realign the stored data to generate output data stream Dout.
To perform a channel bonding operation, one of the elastic buffers is designated the master elastic buffer (in this case elastic buffer 100a), and all channel bonding operations are initiated by the master. Typically, the master elastic buffer (elastic buffer 100a) asserts a channel bonding signal CB_load a specified wait period (“channel bonding wait”) after alignment block DA is read out of memory array 120a. The channel bonding wait is a fixed number of data blocks that must be read, starting with the alignment block, before performing a channel bonding operation. The wait period gives each slave elastic buffer time to store the corresponding alignment block from its partial input data stream and establish the location of that alignment block as the current reference point for alignment. In response to channel bonding signal CB_load, each elastic buffer sets its read address to point to the memory location of its stored alignment block (or a memory location having a defined position relative to the memory location of the alignment block). For example, elastic buffer 100a sets read address Raddr_a to the address corresponding to the memory location (A01) in memory array 120a of alignment block DA, while elastic buffer 100b (the slave) sets read address Raddr_b to the address corresponding to the memory location (B00) in memory array 120b of alignment block DB. By matching up alignment blocks DA and DB in this manner, the channel bonding operation (sometimes referred to as “channel alignment”) forces subsequent read operations to read out properly matched data blocks from memory arrays 120a and 120b. Note that to maintain this data alignment, the master elastic buffer (100a) must also control the clock correction operations described previously for all the slave elastic buffers (100b).
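A minimal sketch of the read-address reload performed in response to CB_load, using the contents of Table 3, is given below. The function names are hypothetical, the channel bonding wait and address wraparound are omitted for brevity, and each buffer is assumed to have already recorded where its alignment block was stored.

```python
ALIGN = {"DA", "DB"}  # alignment block values (illustrative)


def find_alignment(mem):
    """Return the address at which the alignment block was stored."""
    return next(addr for addr, blk in enumerate(mem) if blk in ALIGN)


def bonded_read(mem_a, mem_b, n_reads):
    """On CB_load, each buffer sets Raddr to its alignment block's address."""
    raddr_a = find_alignment(mem_a)   # master: A01 in Table 3
    raddr_b = find_alignment(mem_b)   # slave:  B00 in Table 3
    pairs = []
    for i in range(n_reads):
        pairs.append((mem_a[raddr_a + i], mem_b[raddr_b + i]))
    return pairs


mem_a = ["da0", "DA", "da1", "da2"]   # memory array 120a
mem_b = ["DB", "db1", "db2", "db3"]   # memory array 120b
pairs = bonded_read(mem_a, mem_b, 3)
```

Once both read addresses point at the alignment blocks, the subsequent reads return (da1, db1) and (da2, db2), i.e., properly matched pairs.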
The design and manufacture of the elastic buffer circuitry can be greatly simplified if the required clock speeds for writing and reading the elastic buffer can be reduced. One way to do this is to increase the buffer's width—i.e., the number of data blocks written or read per clock cycle. FIG. 1c shows a conventional elastic buffer 100c that is substantially similar to elastic buffer 100 shown in FIG. 1a, except that the memory locations in memory array 160 are addressed in two-block increments. Therefore, with each pulse of write clock signal Wclk, controller 110c generates a write address Waddr that corresponds to two adjacent memory locations in memory array 160, and two consecutive data blocks from input data stream Din are written into the designated memory locations. Similarly, a positive edge on read clock signal Rclk causes controller 110c to generate a read address Raddr that corresponds to two adjacent memory locations, and two stored data blocks are read out as part of output data stream Dout. Because multiple data blocks are written and read from elastic buffer 100c during each write or read cycle, respectively, elastic buffer 100c can be referred to as a “multi-block width” elastic buffer. Specifically, memory array 160 has a width of two data blocks, making it twice as wide as memory array 120 of elastic buffer 100.
By writing and reading multiple data blocks on each clock pulse, a multi-block width elastic buffer can significantly increase data throughput over an elastic buffer that only writes or reads a single data block per clock cycle. For example, assume that input data stream Din is made up of consecutive data blocks da0, db0, da1, db1, da2, db2, da3, db3, etc. In response to a positive edge on write clock signal Wclk, controller 110c would generate a write address Waddr corresponding to, for example, memory locations A00 and B00. Data block da0 would then be stored in memory location A00, and data block db0 would be stored in memory location B00. On the next positive edge on write clock signal Wclk, data block da1 would be stored in memory location A01 and data block db1 would be stored in memory location B01. Each subsequent write operation would store two more data blocks from input data stream Din. In a similar manner, each positive edge on read clock signal Rclk reads out the data blocks stored in two adjacent memory locations. At a given clock speed, elastic buffer 100c therefore has twice the data throughput of elastic buffer 100 shown in FIG. 1a. Equivalently, it can deliver the same throughput at half the clock speed, thereby decreasing design and manufacturing complexity.
However, because each write address Waddr and read address Raddr generated by controller 110c corresponds to a memory location having a multi-block width, elastic buffer 100c requires that any correction sequence in input data stream Din must be “full width”—i.e., the correction sequence must occupy the full width of memory array 160. This limitation can make an increased-width elastic buffer, such as elastic buffer 100c, incompatible with communications protocols that incorporate correction sequences having lengths that are not integral multiples of the elastic buffer width. For example, some modern high-speed communications protocols, such as XAUI (10 gigabit extended Attachment Unit Interface), use a one-byte correction sequence. Each time such a one-byte correction sequence is written into a memory location in memory array 160, a non-correction data block (byte) could be written into an adjacent memory location (assuming that elastic buffer 100c has a width of two bytes). A clock correction operation using that correction block would then either delete the adjacent data block or add copies of the adjacent data block to the output data stream.
In addition to requiring that the correction sequence length match the elastic buffer width, correct operation of the elastic buffer would also require that a single correction sequence be properly aligned within memory array 160. Assume for example that the correction sequence length and elastic buffer width both are two data blocks. Suppose that input data stream Din is made up of consecutive data blocks da1, X1, X2, and db2, where data blocks X1 and X2 represent a correction sequence. Further suppose data blocks da1, X1, X2, and db2 are written into memory locations A00, B00, A01, and B01, respectively. The correction sequence (X1 X2) occupies memory locations (B00 and A01, respectively) addressed by two different values of read address Raddr, and there is no way to manipulate read address Raddr to effect clock correction. Skipping or repeating an address will always cause the spurious omission or insertion of either data block da1 or data block db2.
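The straddling problem can be checked exhaustively for this four-block example. The sketch below groups the stream into two-block rows, one row per read address, and lists every output obtainable by skipping or repeating a single address. The helper names are hypothetical.

```python
def rows(blocks, width=2):
    """Group a block stream into full-width rows, one row per address."""
    return [tuple(blocks[i:i + width]) for i in range(0, len(blocks), width)]


def flatten(row_list):
    """Concatenate rows back into a flat stream of blocks."""
    return [blk for row in row_list for blk in row]


# Address 0 holds (da1, X1) in A00/B00; address 1 holds (X2, db2) in A01/B01.
mem = rows(["da1", "X1", "X2", "db2"])

# Streams a correct clock correction would have to produce.
wanted_after_skip = ["da1", "db2"]
wanted_after_repeat = ["da1", "X1", "X2", "X1", "X2", "db2"]

# Every output obtainable by skipping exactly one address.
skip_results = [flatten(mem[:i] + mem[i + 1:]) for i in range(len(mem))]
# Every output obtainable by repeating exactly one address.
repeat_results = [flatten(mem[:i + 1] + mem[i:]) for i in range(len(mem))]

skip_possible = wanted_after_skip in skip_results
repeat_possible = wanted_after_repeat in repeat_results
```

Both flags come out false: each skip drops either da1 or db2 along with half of the correction sequence, and each repeat duplicates one of them.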
A similar issue arises when elastic buffers having multi-block widths are used in a multi-channel configuration. FIG. 1d shows elastic buffers 100d and 100e in two transceivers configured to act as two channels for an input data stream Din. Elastic buffers 100d and 100e are substantially similar to elastic buffer 100c shown in FIG. 1c, each having a width of two data blocks. Therefore, input data stream Din is split into two partial data streams Din_d and Din_e by feeding alternating pairs of data blocks from input data stream Din into each partial data stream. On every positive edge on write clock signal Wclk_d, two data blocks from partial data stream Din_d are sent to elastic buffer 100d and stored in memory array 160d at write address Waddr_d. Similarly, on each positive edge on write clock signal Wclk_e (which runs at the same frequency as write clock signal Wclk_d), two data blocks from partial data stream Din_e are sent to elastic buffer 100e and are stored in memory array 160e at write address Waddr_e. Therefore, on each write clock cycle, four sequential data blocks from input data stream Din are stored. For example, assume that partial input data stream Din_d includes data blocks d1, d2, DD, d3, d4, d5, while partial input data stream Din_e includes the data blocks e1, e2, DE, e3, e4, e5, where DD and DE are the alignment blocks. Assume further that due to misalignment of data between the channels, memory arrays 160d and 160e are written according to Table 4.
TABLE 4
MISALIGNED MULTI-CHANNEL DATA IN MULTI-BLOCK WIDTH ELASTIC BUFFERS

160d        160e
A00: d1     C00: (none)
B00: d2     D00: e1
A01: DD     C01: e2
B01: d3     D01: DE
A02: d4     C02: e3
B02: d5     D02: e4
...         C03: e5
...         D03: (none)
...         ...
Because the data stored in memory arrays 160d and 160e can only be addressed in specific two-block increments, output data stream Dout cannot be placed in proper alignment. At some positive edge of read clock signal Rclk_d, data blocks d4 and d5 will be written to partial output data stream Dout_d. At the same positive edge of read clock signal Rclk_e, data blocks e4 and e5 should be written to partial output data stream Dout_e, but no value of read address Raddr_e can cause these two blocks to be written out together.
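The addressing constraint can be made concrete with the contents of Table 4. In the sketch below, each tuple is the pair of blocks returned by one read address, and `None` marks a location for which Table 4 lists no data block. The variable names are hypothetical.

```python
# Memory arrays from Table 4, one tuple per two-block read address.
mem_d = [("d1", "d2"), ("DD", "d3"), ("d4", "d5")]                     # 160d
mem_e = [(None, "e1"), ("e2", "DE"), ("e3", "e4"), ("e5", None)]      # 160e

# Pairs that must appear together on Dout_d and Dout_e at the same read edge.
needed_d = ("d4", "d5")
needed_e = ("e4", "e5")

d_readable = needed_d in mem_d   # some Raddr_d returns (d4, d5)
e_readable = needed_e in mem_e   # no Raddr_e returns (e4, e5)
```

Blocks e4 and e5 sit under two different read addresses in memory array 160e, so the pair needed to match (d4, d5) is never available in a single read.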
Use of a multi-block width elastic buffer may be desirable to simplify the buffer design by reduction of the clock speeds. At the same time, it may be required that a transceiver (including the elastic buffer) support a variety of communications protocols. This is particularly desirable, for example, for transceivers that are to be embedded in a programmable logic device such as a field-programmable gate array (FPGA), which is intended to be configurable for a broad range of applications. However, as described above, existing multi-block width elastic buffers are limited to protocols having correction sequence lengths matching the width of the buffer(s), and for which clock correction and channel alignment operations are properly timed to prevent unachievable data block sequences on output data stream Dout. Accordingly, it is desirable to provide an elastic buffer having a multi-block width that overcomes these limitations.