FIG. 1 shows a first device (SLOW-DEVICE) 110 connected to a second device (OTHER-DEVICE) 120 by bus lines 131-133. Bus lines 131-132 carry control signals (READY) between the devices, and line 133 carries data signals from device 110 to device 120. In other words, device 110 is a source of data, and device 120 is a consumer of data. In addition, bus lines can also carry timing signals in any number of well known ways. The timing signals are typically generated from clock cycles.
The bus lines 131-133 use, for example, the well known industry standard PCI protocol. As a characteristic, the PCI bus imposes aggressive setup and clock-to-out requirements on its protocol control signals. This makes it difficult for relatively slow devices to process control signals received from a device operating at a substantially different rate. An example of such a slow device is one implemented using slow circuit technologies such as a field programmable gate array (FPGA).
In particular, during target memory read and master memory write operations that have multiple data phases, i.e., "bursts," the slow device 110 will have difficulty processing bus control signals in a manner that allows it to reliably decide, on each successive clock cycle, whether or not the other device is ready to receive a next data phase.
Consider FIG. 1. The key difficulty is processing the bus control signals (READY) 131-132 generated by the slow device and the other device, respectively. An ADVANCE signal 111 to a multiplexer 115 is used to decide whether an output register 112 can be loaded with next data 113, e.g., when both devices are ready.
In order to produce the ADVANCE signal 111 correctly, it must be determined whether or not the output register must still hold current data 114. If the other device 120 did not accept the current data 114, then the current data must be retained in the output register 112.
However, if the slow device asserts READY 131 and the other device 120 accepts the data, then the next data 113 must be loaded into the output register 112 so that the next data is available on the data line 133. In this case, the other device 120 can accept the next data on the next clock cycle.
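By way of illustration only, the hold-or-load behavior of the output register 112 described above can be sketched as a small cycle-by-cycle software model. The function names advance, mux, and transfer_burst are hypothetical and do not appear in FIG. 1. The sketch assumes that the ADVANCE signal 111 is simply the logical AND of the two READY signals and that the slow device always has valid data to offer. It is not a description of any actual circuit.

```python
def advance(slow_ready: bool, other_ready: bool) -> bool:
    """ADVANCE 111, assumed here to be the AND of both READY signals."""
    return slow_ready and other_ready


def mux(advance_sel: bool, current_data, next_data):
    """Multiplexer 115: select next data on ADVANCE, else retain current data."""
    return next_data if advance_sel else current_data


def transfer_burst(words, other_ready_by_cycle):
    """Model one burst; return the words accepted by the other device.

    words: data the slow device wants to send, in order.
    other_ready_by_cycle: READY of the other device on each clock cycle.
    """
    received = []
    index = 0                                    # position of the current data
    output_register = words[0] if words else None
    for other_ready in other_ready_by_cycle:
        if index >= len(words):
            break                                # burst complete
        next_data = words[index + 1] if index + 1 < len(words) else None
        slow_ready = True                        # assumption: source always ready
        adv = advance(slow_ready, other_ready)
        if adv:
            received.append(output_register)     # data phase completes this cycle
            index += 1
        # Register loads next data only when the current data was accepted.
        output_register = mux(adv, output_register, next_data)
    return received
```

In this model, a cycle in which the other device is not ready leaves the output register unchanged, so the same word is offered again on the following cycle and no data is lost or duplicated.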
The standard PCI bus protocol commits each data phase on the cycle in which the data, e.g., a 32-bit or 64-bit word, is transferred onto the bus. However, many cohesive data transfers use larger bursts of words, for example, network packets or disk blocks. Here, the data is only meaningful when the entire burst has been transferred.
Previous approaches rely on maintaining a rigorous correctness at the bus protocol level in the slow device. The slow device cannot permit itself to guess what the receiver's response will be. Wrong guesses will lead to incorrect data transfers with no higher level mechanism for detecting or correcting such incorrect transfers.
Therefore, in the prior art, slow devices introduce wait states on the bus in the form of a delay cycle for every data cycle. That is, the slow device waits one complete extra clock cycle on every cycle that transfers data so that the ready decisions will always be correct. Introducing wait states on the bus disrupts the flow of bursts of data and reduces bus bandwidth, so performance is compromised. Implementing the slow device in faster circuit technology instead increases cost. Neither solution is satisfactory.
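The bandwidth cost of such wait states can be estimated with simple arithmetic, sketched below for illustration. The helper names burst_cycles and relative_bandwidth are hypothetical. The sketch counts only the data phases of a burst and, as a simplifying assumption, ignores address phases, arbitration, and turnaround cycles.

```python
def burst_cycles(data_phases: int, wait_states_per_phase: int) -> int:
    """Clock cycles spent on the data phases of one burst."""
    return data_phases * (1 + wait_states_per_phase)


def relative_bandwidth(wait_states_per_phase: int) -> float:
    """Data bandwidth as a fraction of the zero-wait-state rate."""
    return 1 / (1 + wait_states_per_phase)


# A 16-word burst with one wait state per data phase, as in the prior art
# approach described above, takes twice as many data cycles as a burst
# with no wait states.
no_wait = burst_cycles(16, 0)
one_wait = burst_cycles(16, 1)
fraction = relative_bandwidth(1)
```

Under these assumptions, one wait state per data phase doubles the number of cycles per burst and halves the usable data bandwidth.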
Therefore, it is desired to achieve maximum or close to maximum speed data transfers for bursts of data from a slow device without any delays or wait states.