1. Field of the Invention
This invention is related to the field of computer systems and, more particularly, to computer systems wherein input/output (I/O) operations access memory.
2. Description of the Related Art
Generally, personal computers (PCs) and other types of computer systems have been designed around a shared bus system for accessing memory. One or more processors and one or more input/output (I/O) devices are coupled to memory through the shared bus. The I/O devices may be coupled to the shared bus through an I/O bridge which manages the transfer of information between the shared bus and the I/O devices, while processors are typically coupled directly to the shared bus or are coupled through a cache hierarchy to the shared bus.
Unfortunately, shared bus systems suffer from several drawbacks. For example, the multiple devices attached to the shared bus present a relatively large electrical capacitance to devices driving signals on the bus. In addition, the multiple attach points on the shared bus produce signal reflections at high signal frequencies which reduce signal integrity. As a result, signal frequencies on the bus are generally kept relatively low in order to maintain signal integrity at an acceptable level. The relatively low signal frequencies reduce signal bandwidth, limiting the performance of devices attached to the bus.
Lack of scalability to larger numbers of devices is another disadvantage of shared bus systems. The available bandwidth of a shared bus is substantially fixed (and may decrease if adding additional devices causes a reduction in signal frequencies upon the bus). Once the bandwidth requirements of the devices attached to the bus (either directly or indirectly) exceed the available bandwidth of the bus, devices will frequently be stalled when attempting access to the bus, and overall performance of the computer system including the shared bus will most likely be reduced.
On the other hand, distributed memory systems lack many of the above disadvantages. A computer system with a distributed memory system includes multiple nodes, two or more of which are coupled to different memories. The nodes are coupled to one another using any suitable interconnect. For example, each node may be coupled to each other node using dedicated lines. Alternatively, each node may connect to a fixed number of other nodes, and transactions may be routed from a first node to a second node to which the first node is not directly connected via one or more intermediate nodes. A memory address space of the computer system is assigned across the memories in each node.
In general, a “node” is a device which is capable of participating in transactions upon the interconnect. For example, the interconnect may be packet based, and the node may be configured to receive and transmit packets. Generally speaking, a “packet” is a communication between two nodes: an initiating or “source” node which transmits the packet and a destination or “target” node which receives the packet. When a packet reaches the target node, the target node accepts the information conveyed by the packet and processes the information internally. A node located on a communication path between the source and target nodes may relay the packet from the source node to the target node.
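By way of illustration only, the relay behavior described above may be sketched in Python as follows. The `Packet` class, the integer node identifiers, and the linear `chain` topology are hypothetical and are assumed here purely for the example; the text does not prescribe any particular topology or packet layout.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    source: int   # hypothetical identifier of the initiating (source) node
    target: int   # hypothetical identifier of the destination (target) node
    payload: str


def relaying_nodes(chain, packet):
    """Return the nodes that relay a packet along a linear chain of node IDs.

    Nodes strictly between the source and the target pass the packet on;
    only the target accepts and processes it.
    """
    start = chain.index(packet.source)
    end = chain.index(packet.target)
    step = 1 if end >= start else -1
    return chain[start + step:end:step]


relays = relaying_nodes([0, 1, 2, 3], Packet(source=0, target=3, payload="read"))
```

In this sketch, directly connected nodes produce an empty relay list, which corresponds to the case in which no intermediate node is involved.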
Distributed memory systems present design challenges which differ from the challenges in shared bus systems. For example, shared bus systems regulate the initiation of transactions through bus arbitration. Accordingly, a fair arbitration algorithm allows each bus participant the opportunity to initiate transactions. The order of transactions on the bus may represent the order that transactions are performed (e.g. for coherency purposes). On the other hand, in distributed systems, nodes may initiate transactions concurrently and use the interconnect to transmit the transactions to other nodes. These transactions may have logical conflicts between them (e.g. coherency conflicts for transactions involving the same address) and may experience resource conflicts (e.g. buffer space may not be available in various nodes) since no central mechanism for regulating the initiation of transactions is provided. Accordingly, it is more difficult to ensure that information continues to propagate among the nodes smoothly and that deadlock situations (in which no transactions are completed due to conflicts between the transactions) are avoided.
A computer system may include a processing portion with nodes performing processing functions, and an I/O portion with nodes implementing various I/O functions. Two or more of the nodes of the processing portion may be coupled to different memories. The processing portion may operate in a “coherent” fashion such that the processing portion preserves the coherency of data stored within the memories. On the other hand, as no memory is located within the I/O portion, the I/O portion may be operated in a “non-coherent” fashion. Packets used to convey data within the processing and I/O portions need not have the same formats. However, the I/O functions within the I/O portion must be able to generate memory operations (e.g., memory read and write operations) which must be conveyed from the I/O portion into the processing portion. Similarly, the processing functions within the processing portion must be able to generate I/O operations (e.g., I/O read and write operations) which must be conveyed from the processing portion into the I/O portion. It would thus be desirable to have a computer system which implements a system and method for conveying packets between a coherent processing portion of a computer system and a non-coherent I/O portion of the computer system.
A computer system is presented which implements a system and method for conveying packets between a coherent processing subsystem and a non-coherent input/output (I/O) subsystem. The processing subsystem includes a first processing node coupled to a second processing node via a coherent communication link. The first and second processing nodes may each include a processor preferably executing software instructions (e.g., a processor core configured to execute instructions of a predefined instruction set). The first processing node includes a host bridge which translates packets moving between the processing subsystem and the I/O subsystem. The I/O subsystem includes an I/O node coupled to the first processing node via a non-coherent communication link. In one embodiment, the I/O subsystem includes multiple I/O nodes coupled via non-coherent communication links one after another in series or daisy chain fashion. Each I/O node may embody one or more I/O functions (e.g., modem, sound card, etc.).
The coherent and non-coherent communication links are physically identical. For example, the coherent and non-coherent communication links may have the same electrical interface and the same signal definition. In one embodiment, the coherent and non-coherent communication links are bidirectional communication links made up of two unidirectional sets of transmission media (e.g., wires). Each communication link may include a first set of three unidirectional transmission media directed from a first node to a second node, and a second set of three unidirectional transmission media directed from the second node to the first node.
Both the first and second sets may include separate transmission media for a clock (CLK) signal, a control (CTL) signal, and a command/address/data (CAD) signal. In a preferred embodiment, the CLK signal serves as a clock signal for the CTL and CAD signals. A separate CLK signal may be provided for each 8-bit byte of the CAD signal. The CAD signal is used to convey control packets and data packets. Types of control packets may include command packets and response packets. The CAD signal may be, for example, 8, 16, or 32 bits wide, and may thus include 8, 16, or 32 separate transmission media. The CTL signal may be asserted when the CAD signal conveys a command packet, and may be deasserted when the CAD signal conveys a data packet. The CTL and CAD signals may transmit different information on the rising and falling edges of the CLK signal. Accordingly, two data units may be transmitted in each period of the CLK signal.
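The arithmetic implied by the signal description above may be sketched in Python as follows. The function names are hypothetical, and the sketch assumes that one data unit equals the full width of the CAD signal and that only the example widths of 8, 16, and 32 bits are accepted.

```python
EXAMPLE_CAD_WIDTHS = (8, 16, 32)  # the example widths named in the text


def bits_per_clk_period(cad_width_bits):
    """Bits conveyed on CAD in one CLK period when both edges carry data."""
    if cad_width_bits not in EXAMPLE_CAD_WIDTHS:
        raise ValueError("this sketch only accepts widths of 8, 16, or 32 bits")
    transfers_per_period = 2  # one on the rising edge, one on the falling edge
    return cad_width_bits * transfers_per_period


def clk_signals(cad_width_bits):
    """Number of CLK signals when one is provided per 8-bit byte of CAD."""
    if cad_width_bits not in EXAMPLE_CAD_WIDTHS:
        raise ValueError("this sketch only accepts widths of 8, 16, or 32 bits")
    return cad_width_bits // 8
```

Under these assumptions, a 16-bit CAD signal would convey 32 bits per CLK period and would be accompanied by two CLK signals in each direction.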
The host bridge within the first processing node receives a non-coherent packet from the I/O node via the non-coherent communication link and responds to the non-coherent packet by translating the non-coherent packet to a coherent packet. The host bridge then transmits the coherent packet to the second processing node via the coherent communication link. The coherent and non-coherent packets have identically located command fields, wherein the contents of the command field identify a command to be carried out. The translating process includes copying the contents of the command field of the non-coherent packet to the command field of the coherent packet.
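Because the command fields are identically located, the copying step may be as simple as a masked byte copy. The following Python sketch illustrates the idea; the position and width of the command field (the low six bits of the first byte) and the name `copy_command_field` are assumptions made for the example and are not stated in the text.

```python
CMD_MASK = 0x3F  # assumption: a 6-bit command field in the low bits of byte 0


def copy_command_field(noncoherent, coherent_template):
    """Return a copy of coherent_template carrying the non-coherent command.

    Both arguments are bytes objects. Only the command field bits change;
    every other bit of the coherent packet is left as given.
    """
    out = bytearray(coherent_template)
    out[0] = (out[0] & ~CMD_MASK & 0xFF) | (noncoherent[0] & CMD_MASK)
    return bytes(out)


coherent = copy_command_field(bytes([0xAD, 0, 0, 0]), bytes([0xC0, 1, 2, 3]))
```

The same routine would serve for translation in the opposite direction, since the field occupies the same location in both packet formats.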
The coherent packet may also include a destination node field for storing destination node identification information and a destination unit field for storing destination unit identification information. The first processing node may have an address map including a list of address ranges and corresponding node identifiers and unit identifiers. The non-coherent packet may include address information. The translating process may include using the address information to retrieve a destination node identifier and a destination unit identifier from the address map, wherein the destination node identifier identifies the destination node, and wherein the destination unit identifier identifies the destination unit. The translating process may also include storing the destination node identifier within the destination node field of the coherent packet, and storing the destination unit identifier within the destination unit field of the coherent packet.
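The address map lookup described above may be sketched in Python as follows. The `MapEntry` layout, the example address ranges, the dictionary representation of a packet, and the field names `dest_node` and `dest_unit` are all hypothetical and are chosen only to make the example concrete.

```python
from typing import NamedTuple


class MapEntry(NamedTuple):
    base: int      # first address of the range
    limit: int     # last address of the range (inclusive)
    node_id: int   # node identifier for the range
    unit_id: int   # unit identifier for the range


# Hypothetical address map held by the first processing node.
ADDRESS_MAP = [
    MapEntry(0x0000_0000, 0x3FFF_FFFF, node_id=0, unit_id=1),
    MapEntry(0x4000_0000, 0x7FFF_FFFF, node_id=1, unit_id=1),
]


def lookup_destination(address):
    """Return (node_id, unit_id) for the range containing address, or None."""
    for entry in ADDRESS_MAP:
        if entry.base <= address <= entry.limit:
            return entry.node_id, entry.unit_id
    return None


def set_destination(coherent_packet, address):
    """Store the destination identifiers in the coherent packet's fields."""
    dest = lookup_destination(address)
    if dest is None:
        raise LookupError(f"address {address:#x} is not in the address map")
    coherent_packet["dest_node"], coherent_packet["dest_unit"] = dest
    return coherent_packet
```

How an address outside every range should be handled is not addressed by the text; raising an error is merely one choice made for this sketch.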
The coherent packet may also include a source tag field for storing coherent packet identification information. The translating process may include: (i) obtaining a coherent source tag for the coherent packet from the first processing node, wherein the coherent source tag identifies the coherent packet, and (ii) storing the coherent source tag within the source tag field of the coherent packet.
The non-coherent packet may also include a unit identifier which identifies the I/O node as the source of the non-coherent packet, and a non-coherent source tag which identifies the non-coherent packet. The host bridge may include a data buffer. The translating process may include storing the coherent source tag and the corresponding unit identifier and the non-coherent source tag within the data buffer.
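One possible organization of the data buffer is a table keyed by the coherent source tag, as in the following Python sketch. The class name `HostBridgeBuffer`, its methods, and the rejection of a tag that is already in use are assumptions made for illustration; the text specifies only which values are stored together.

```python
class HostBridgeBuffer:
    """Associates a coherent source tag with the originating I/O request."""

    def __init__(self):
        self._entries = {}

    def store(self, coherent_src_tag, unit_id, noncoherent_src_tag):
        # Assumption: a coherent source tag identifies one outstanding packet,
        # so storing a tag that is already present is treated as an error.
        if coherent_src_tag in self._entries:
            raise ValueError("coherent source tag already in use")
        self._entries[coherent_src_tag] = (unit_id, noncoherent_src_tag)

    def lookup(self, coherent_src_tag):
        """Return (unit_id, noncoherent_src_tag) stored under the tag."""
        return self._entries[coherent_src_tag]


buffer = HostBridgeBuffer()
buffer.store(coherent_src_tag=7, unit_id=3, noncoherent_src_tag=12)
```

In this sketch the coherent source tag is passed in by the caller, consistent with the tag being obtained from the first processing node rather than generated by the buffer itself.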
The host bridge may receive a coherent packet from the second processing node via the coherent communication link. The host bridge may be configured to respond to the coherent packet by translating the coherent packet to a non-coherent packet and transmitting the non-coherent packet to the I/O node via the non-coherent communication link. Again, the coherent and non-coherent packets have identically located command fields, and the translating process includes copying the contents of the command field of the coherent packet to the command field of the non-coherent packet.
The non-coherent packet may include a unit identification field for storing destination unit identification information, and a source tag field for storing non-coherent packet identification information. The translating process may include using the coherent source tag to obtain a unit identifier and a non-coherent source tag from the data buffer, wherein the unit identifier identifies the I/O node as the destination of the non-coherent packet, and wherein the non-coherent source tag identifies the non-coherent packet. The translating process may also include storing the unit identifier within the unit identification field of the non-coherent packet, and storing the non-coherent source tag within the source tag field of the non-coherent packet.
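The coherent to non-coherent translation described above may be sketched in Python as follows. Packets are represented as dictionaries, and the field names `cmd`, `src_tag`, and `unit_id`, together with the plain dictionary standing in for the data buffer, are hypothetical choices for the example.

```python
def to_noncoherent(coherent_packet, data_buffer):
    """Build a non-coherent packet from a coherent one.

    data_buffer maps a coherent source tag to the pair
    (unit_id, noncoherent_src_tag) saved when the original request
    was translated in the other direction.
    """
    unit_id, noncoherent_src_tag = data_buffer[coherent_packet["src_tag"]]
    return {
        "cmd": coherent_packet["cmd"],    # command field copied unchanged
        "unit_id": unit_id,               # identifies the destination I/O node
        "src_tag": noncoherent_src_tag,   # identifies the non-coherent packet
    }


data_buffer = {7: (3, 12)}
noncoherent = to_noncoherent({"cmd": 0x30, "src_tag": 7}, data_buffer)
```

Whether and when the buffer entry is released after the lookup is not stated in the text, so the sketch leaves the entry in place.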
A first method for use in a computer system includes the host bridge within the first processing node receiving a non-coherent packet from the I/O node via the non-coherent communication link. The host bridge translates the non-coherent packet to a coherent packet, wherein the coherent and non-coherent packets have identically located command fields. As described above, the translating includes copying the contents of the command field of the non-coherent packet to the command field of the coherent packet. The host bridge transmits the coherent packet to the second processing node via the coherent communication link, wherein the coherent and non-coherent communication links are physically identical.
As described above, the translating may also include using address information of the non-coherent packet and the address map described above to determine a destination node identifier and a destination unit identifier of the coherent packet, wherein the destination node identifier identifies the destination node, and wherein the destination unit identifier identifies the destination unit. The translating may also include storing the destination node identifier within the destination node field of the coherent packet, and storing the destination unit identifier within the destination unit field of the coherent packet.
As described above, the translating may also include: (i) obtaining a coherent source tag from the first processing node, wherein the coherent source tag identifies the coherent packet, and (ii) storing the coherent source tag within a source tag field of the coherent packet.
A second method for use in a computer system may include the host bridge receiving a coherent packet from the second processing node via the coherent communication link. The host bridge translates the coherent packet to a non-coherent packet, wherein the coherent and non-coherent packets have identically located command fields. The translating includes copying the contents of the command field of the coherent packet to the command field of the non-coherent packet. The host bridge transmits the non-coherent packet to the I/O node via the non-coherent communication link, wherein the coherent and non-coherent communication links are physically identical.
As described above, the translating may also include using the coherent source tag of the coherent packet to retrieve the unit identifier and the non-coherent source tag from the data buffer within the host bridge, storing the unit identifier within the unit identification field of the non-coherent packet, and storing the non-coherent source tag within the source tag field of the non-coherent packet.