1. Field of the Invention
This invention is related to the field of computer systems and, more particularly, to translating data between clock domains in computer systems.
2. Description of the Related Art
Computer systems increasingly include larger numbers of components to improve performance. For example, high performance computer systems frequently include multiple processors (or CPUs) for concurrent execution of multiple programs or multiple portions of programs. Furthermore, computer systems may include additional input/output (I/O) devices for communicating with other computer systems or providing programs and data for execution.
As the number of devices in the computer system increases, it becomes more difficult to clock all of the devices in the computer system using clocks derived from a single clock source. Clocking numerous devices from a single clock source is difficult in terms of routing the clock lines from the source to all of the devices in a reasonably skew-controlled manner, driving the relatively large load of all the devices from the single source, etc. Accordingly, it is desirable to allow multiple clock sources within the computer system, even if devices interconnected via the same interface are clocked by clocks derived from different clock sources.
Since the clocks are derived from different sources, the clocks may experience dynamic variation with respect to each other. Dynamic variation may occur, e.g., due to temperature differences, voltage differences, accumulated phase error in a phase locked loop, or noise differences. Accordingly, clocks derived from different sources form different clock domains. Generally, a “clock domain” refers to the circuits which are clocked using clocks derived from a single clock source. A “clock” or “clock signal” is a signal repeating at a regular rate, or period. The number of times the period repeats per second is the frequency of the clock.
Devices which communicate via an interface but which belong to different clock domains may experience difficulties in communicating upon the interface. Generally, information conveyed via the interface is translated from one clock domain to another. A buffer (e.g. a FIFO buffer) may be used to translate information from one clock domain to the other. Buffer entries are allocated in response to the source clock signal corresponding to the source clock domain and are deallocated in response to the target clock signal corresponding to the target clock domain. The amount of time a particular item of information within the buffer is valid is increased by having multiple buffer entries, allowing time to process each item and time for the variations between the source and target clocks. As used herein, the term “buffer” refers to one or more clocked storage elements (e.g. registers, flops, latches, RAM arrays, etc.).
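The allocate-on-source-clock, deallocate-on-target-clock behavior described above can be illustrated with a minimal software sketch. This is a behavioral model only, not circuitry from this disclosure: the class name `ClockCrossingFifo`, its method names, and the shared occupancy counter are all assumptions made for illustration (real hardware would typically synchronize read and write pointers across the two domains rather than share a counter).

```python
class ClockCrossingFifo:
    """Hypothetical behavioral model of a FIFO between two clock domains."""

    def __init__(self, depth):
        self.depth = depth
        self.entries = [None] * depth
        self.write_index = 0  # advanced only on source clock edges
        self.read_index = 0   # advanced only on target clock edges
        self.count = 0        # simplification: shared occupancy counter

    def source_clock_edge(self, datum):
        """Allocate one entry in response to the source clock."""
        if self.count == self.depth:
            raise OverflowError("source outpaced target: no free entry")
        self.entries[self.write_index] = datum
        self.write_index = (self.write_index + 1) % self.depth
        self.count += 1

    def target_clock_edge(self):
        """Deallocate one entry in response to the target clock.

        Returns the oldest datum, or None if the buffer is empty.
        """
        if self.count == 0:
            return None
        datum = self.entries[self.read_index]
        self.read_index = (self.read_index + 1) % self.depth
        self.count -= 1
        return datum
```

In this sketch, a deeper buffer keeps each datum valid for more source clock periods before its entry is reused, which is the tolerance for clock variation that the paragraph above describes.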
Unfortunately, buffers alone do not solve the problem of a persistent difference in frequencies between the source clock and the target clock, even though the source and target clocks may nominally be operating at the same frequency. For example, the source clock may be operating at a slightly greater frequency than the target clock. The source may, e.g., provide one item of information per (source) clock on the interface and the target may consume one item of information per (target) clock from the interface. In this case, over some time interval, the source provides more items of information than the target can consume. While an additional buffer entry within the FIFO buffer could be allocated to handle the first occurrence of the additional data, the second occurrence would still not be handled. It is difficult and expensive to eliminate all frequency difference between clock signals from different clock domains. Furthermore, it is undesirable to stall the transfer of information upon the interface when the source clock is operating at a higher frequency than the target clock to allow for the target device to process the additional item. Such stalling is undesirable from a performance standpoint, and may introduce undesired complexities as well. Accordingly, a solution to communicating between devices from different clock domains which handles variations in clock frequency between the clock domains without inserting delays on the interface between the devices is needed.
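The arithmetic behind this limitation can be sketched as follows. The function name and the example frequencies are hypothetical and do not come from this disclosure; the sketch assumes one item written per source clock and one item read per target clock, as in the example above.

```python
from fractions import Fraction


def target_cycles_per_surplus_item(source_hz, target_hz):
    """Target clock periods that elapse for each surplus item accumulated.

    Assumes one item is produced per source clock and one item is consumed
    per target clock. Returns None when the source is not faster than the
    target, since no surplus accumulates in that case.
    """
    source = Fraction(source_hz)
    target = Fraction(target_hz)
    if source <= target:
        return None
    # Surplus items per second is (source - target); the target runs
    # `target` cycles per second, so divide to get cycles per surplus item.
    return target / (source - target)
```

For instance, with a hypothetical source clock of 400.04 MHz and a target clock of 400 MHz (a difference of 100 parts per million), `target_cycles_per_surplus_item(400_040_000, 400_000_000)` evaluates to 10000: one extra item accumulates every 10,000 target clock periods. Each additional FIFO entry therefore only postpones overflow by another such interval, which is why added depth alone cannot absorb a persistent frequency difference.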
The problems outlined above are in large part solved by an apparatus as described herein. Generally, the apparatus is configured to monitor the source and target clocks (e.g., receive and transmit clocks, respectively, each from different clock domains) to determine if the respective frequencies of the clocks lead to more data being received by the buffer used to communicate between the two devices than is transmitted from that buffer. Upon detecting such a situation, a staging buffer is used to pre-read entries from the buffer and transfer these entries to the output of the buffer. Effectively, the transmit data pipeline may be dynamically extended by a stage comprising the staging buffer. The staging buffer continues to be used until a synchronization event occurs. A synchronization event allows for the transmit logic to “catch up” by, e.g., processing two items of information from the buffer concurrently. Advantageously, the receive clock frequency may exceed the transmit clock frequency by an amount dependent upon the minimum frequency of synchronization events, and the staging buffer operated as described herein may account for the frequency differences. Furthermore, no stalling on the interface between devices may be required as the items continue to be processed from the buffer.
Broadly speaking, an apparatus for translating data from a first clock domain to a second clock domain is contemplated. The apparatus comprises a buffer, a staging buffer, and control logic. The buffer comprises a plurality of entries and is clocked by a first clock signal corresponding to the first clock domain. Each of the plurality of entries is configured to store a datum being translated from the first clock domain to the second clock domain. Coupled to receive a first datum read from the buffer, the staging buffer is clocked by a second clock signal corresponding to the second clock domain. Coupled to receive the first clock signal and the second clock signal, the control logic is configured to monitor the first clock signal and the second clock signal to determine if, during a period of the second clock signal, an amount of data received by the buffer exceeds an amount of data read from the buffer. Additionally, the control logic is configured to detect a synchronization event. The control logic is configured to selectively forward one or both of: (i) the first datum, and (ii) a second datum read from the buffer responsive to determining the amount of data and to detecting the synchronization event. Moreover, a computer system including a first processing node and a second processing node including the apparatus is contemplated. The first processing node is coupled to a link including one or more data lines and a clock line, and is configured to drive data upon the data lines and a first clock signal upon the clock line. The second processing node is coupled to the link, and is configured to receive the data according to the first clock signal and to process the data according to a second clock signal derived from a different source than the first clock signal.
Additionally, a method for translating data from a first clock domain to a second clock domain is contemplated. Data is received into a buffer according to a first clock signal corresponding to the first clock domain. A first datum is read from the buffer into a staging buffer according to a second clock signal corresponding to the second clock domain. A second datum is read from the buffer according to the second clock signal. The first datum and the second datum are selectively transmitted into the second clock domain responsive to determining if, during a period of the second clock signal, an amount of data received by the buffer exceeds an amount of data read from the buffer and further responsive to detecting a synchronization event.
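One possible reading of the selective transmission described above is sketched below as a behavioral model stepped once per period of the second clock signal. This is an illustrative interpretation, not the claimed implementation: the class `StagedReader`, its `step` method, and the two boolean inputs are hypothetical names, and the sketch assumes at most one surplus datum accumulates between synchronization events and that `None` is never a valid datum.

```python
from collections import deque


class StagedReader:
    """Hypothetical model of the read side with a one-entry staging buffer."""

    def __init__(self):
        self.staged = None  # staging buffer contents; None means unused

    def step(self, fifo, surplus_detected, sync_event):
        """Advance one period of the second (target) clock.

        fifo: deque holding data received in the first clock domain.
        surplus_detected: more data was received than read this period.
        sync_event: a synchronization event occurs this period.
        Returns the list of data forwarded into the second clock domain.
        """
        forwarded = []
        if self.staged is not None:
            # Staging stage in use: the first datum comes from staging.
            forwarded.append(self.staged)
            self.staged = None
            if fifo:
                if sync_event:
                    # Catch up: forward a second datum concurrently,
                    # leaving the staging buffer unused afterwards.
                    forwarded.append(fifo.popleft())
                else:
                    # Keep the pipeline extended by pre-reading again.
                    self.staged = fifo.popleft()
        else:
            if fifo:
                forwarded.append(fifo.popleft())
            if surplus_detected and fifo:
                # Pre-read the surplus datum into the staging buffer.
                self.staged = fifo.popleft()
        return forwarded
```

In this sketch, data order is preserved in every mode; the only period in which two data are forwarded together is the one containing the synchronization event, after which the reader returns to forwarding one datum per period directly from the buffer.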