1. Field of the Invention
This invention relates to the field of computer busses.
2. Background
A major issue in the information age is the speed at which data can be transferred between points. This issue exists in computers both for transferring data between memory and a central processing unit, and for transferring data between devices and/or memory. The issue also exists for transferring data between computers or digitized voice data between telephone units.
As processor speed and network traffic has increased, the physical limitations of traditional interconnects have become more apparent. With commonly available home computers operating at a clock speed of more than 500 MHz, the computing bottleneck is generally a result of moving data within the system and not as a result of processing the data. Rambuss technology is one approach that addresses a similar problem in providing a high bandwidth interconnection between a processor and memory. Other approaches exist for generalized high speed interconnects such as the scaleable coherent interface (SCI).
One problem is that vast amounts of data need to be transported from one place to another as quickly as possible with minimal latency and maximum throughput.
Another problem in the known art is within multiple processor systems that require copies of the same data in multiple caches. Existing cache-coherency protocols on single-bus multiprocessor systems are limited by the physics of the single bus. That is some of the desirable bus characteristics are incompatible: high bandwidth, low latency, and unlimited length. Thus, increasing the number of processors accessing the bus increases the memory/processor traffic, increases the time required to arbitrate the bus, and physically lengthens the bus (thus decreasing the maximum rate at which it can be clocked, and therefore decreasing the bus"" bandwidth).
Single bus multiprocessor systems generally use snooping protocols to maintain cache coherency. The caches on these systems monitor xe2x80x9csnoopxe2x80x9d bus transactions from other caches to determine whether the transaction affects the cache""s contents (effectively, each bus transaction is broadcast to every cache). Replacing the single bus with a point-to-point interconnection to avoid the single bus limitations also removes the broadcast mechanism that is used by the snooping protocols. Thus, the snooping protocols have been considered as not economically feasible for large multiprocessor designs.
It would be advantageous to provide a high speed, highly pipelined, interconnect between one or more processors, caches and memories that supports multi-gigahertz clock rates and four or more times the number of cache agents as is currently supported by known system busses that snoop bus transactions.
The invention discloses methods and apparatus for broadcasting information across an interconnect that includes a plurality of nodes each connected to its adjacent node(s) using one or more links. The nodes can emit cells containing transaction sub-actions onto the links. As a node receives a cell the node retransmits the cell onto other links as the cell is being received. Thus, reducing the latency imposed by the node. The node also captures the transaction sub-action while it retransmits the cell. The node responds to the transaction sub-action by manipulating shared handshake lines that are bussed with the other nodes. The invention enables snooping cache protocols to be successfully used in a larger multi-processor computer system than the prior art.
The foregoing and many other aspects of the present invention will no doubt become obvious to those of ordinary skill in the art after having read the following detailed description of the preferred embodiments that are illustrated in the various drawing figures.