1. Field of the Invention
The present invention generally relates to a computer system with multiple processors. More particularly, the invention relates to a distributed shared memory multiprocessing computer system that supports a high-performance, scalable, and efficient input/output (“I/O”) port protocol to connect to I/O devices.
2. Background of the Invention
Distributed computer systems typically comprise multiple computers connected to each other by a communications network. In some distributed computer systems, networked computers can access shared data. Such systems are sometimes known as parallel computers. If a large number of computers are networked, the distributed system is considered to be “massively” parallel. One advantage of a massively parallel computer is that it can solve complex computational problems in a reasonable amount of time.
In such systems, the memories of the computers are collectively known as a Distributed Shared Memory (“DSM”). Ensuring that the data stored in the DSM is accessed in a coherent manner is a problem. Coherency, in part, means that only one processor can modify any part of the data at any one time; otherwise, the state of the system would be nondeterministic.
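The single-writer aspect of coherency described above can be illustrated with a minimal sketch. It is not taken from any particular DSM implementation; the class `CoherentBlock` and its methods are hypothetical names introduced only for illustration, and a real system would enforce the rule in hardware through a coherence protocol rather than in software.

```python
class CoherentBlock:
    """One block of shared data guarded by a single-writer rule (illustrative only)."""

    def __init__(self, value=0):
        self.value = value
        self.writer = None  # id of the processor currently allowed to modify the block

    def acquire_write(self, proc_id):
        """Grant write permission only if no other processor already holds it."""
        if self.writer is not None and self.writer != proc_id:
            return False
        self.writer = proc_id
        return True

    def write(self, proc_id, value):
        """Modify the block; refused unless proc_id is the current writer."""
        if self.writer != proc_id:
            raise PermissionError(f"processor {proc_id} does not hold write permission")
        self.value = value

    def release_write(self, proc_id):
        """Give up write permission so another processor may acquire it."""
        if self.writer == proc_id:
            self.writer = None
```

In this sketch a second processor that requests write permission while the first still holds it is simply refused, so two processors can never modify the same block at the same time.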
Recently, DSM systems have been built as a cluster of Symmetric Multiprocessors (“SMP”). In SMP systems, shared memory can be implemented efficiently in hardware since the processors are symmetric (e.g., identical in construction and in operation) and operate on a single, shared processor bus. Symmetric multiprocessor systems have good price/performance ratios with four or eight processors. However, because the specially designed bus becomes a bottleneck for message passing between the processors, it is difficult to scale the size of an SMP system beyond twelve or sixteen processors.
It is desired to construct large-scale DSM systems using processors connected by a network. The goal is to allow processors to efficiently share the memories so that data fetched by one program executed on a first processor from memory attached to a second processor is immediately available to all processors.
DSM systems function by using message passing to maintain the coherency of the shared memory distributed throughout the multiprocessing computer system. A message is composed of packets that contain identification information and data. Control of message routing is distributed throughout the system, and each processor visited by a message traveling through the multiprocessing computer system controls the routing of the message through it. Message passing can reduce system performance since delays in transmission of message packets can slow down program execution. Delays in transmission can occur because of high latency due to congestion in the network (i.e., many messages trying to go through the limited physical connections of the network). This type of congestion can cause tremendous performance degradation that can result in long overall program execution times.
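The message structure and distributed routing described above can be sketched as follows. This is an illustration under stated assumptions, not the routing scheme of any actual system: the ring topology, the packet fields (`source`, `dest`, `seq`, `payload`), and the function names are all hypothetical choices made for the example.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    source: int     # identification information: sending node
    dest: int       # identification information: receiving node
    seq: int        # position of this packet within its message
    payload: bytes  # data carried by the packet


def make_message(source, dest, data, packet_size=4):
    """Split data into a message, i.e. an ordered list of packets."""
    return [
        Packet(source, dest, seq, data[i:i + packet_size])
        for seq, i in enumerate(range(0, len(data), packet_size))
    ]


def next_hop(node, dest, num_nodes):
    """Local routing decision made by one node (assumed ring topology):
    forward the packet in whichever direction is shorter."""
    forward = (dest - node) % num_nodes
    backward = (node - dest) % num_nodes
    step = 1 if forward <= backward else -1
    return (node + step) % num_nodes


def deliver(packet, num_nodes):
    """Return the nodes a packet visits; each visited node picks the next hop itself."""
    node = packet.source
    path = [node]
    while node != packet.dest:
        node = next_hop(node, packet.dest, num_nodes)
        path.append(node)
    return path
```

No central router appears in the sketch: every node on the path calls `next_hop` with only its own position and the destination carried in the packet, which mirrors the distributed control of routing described above. Each additional hop adds latency, which is why congestion on a limited number of links slows delivery.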
Each processor of a distributed shared memory computer system typically connects to an I/O bridge/Bus Interface ASIC (referred to as “I/O bridge ASIC”) that permits the processor to gain access to input or output devices. Such devices may be keyboards, monitors, disk drives, hard drives, CD-ROM drives, tape backup systems, and a host of other peripheral I/O devices. The processor typically implements an I/O port protocol that interfaces the processor to the external I/O device through the I/O bridge ASIC. The I/O port protocol performs many operations between the processor and external I/O devices across the I/O bridge ASIC. These operations include direct memory access (“DMA”) read streams, DMA write streams, processor access to I/O devices, I/O device interrupt handling, coherence for I/O translation lookaside buffers (“TLB”), and peer-to-peer I/O communication between two different I/O devices.
Although prior art I/O port protocols used between processors and their I/O bridge ASICs have been suitable for single processor computer systems or twelve- to sixteen-node single-bus SMP systems, these I/O port protocols lacked the ability to allow efficient and fast I/O port operations for a scalable DSM multiprocessing computer system. DSM computer systems that used the computer system's internal bus protocol could not take advantage of the memory and cache coherence protocols because of implementation differences between the internal bus protocol and the coherence protocol. Thus, an I/O access required translation between the two protocols, resulting in complex translation hardware, increased implementation cost, and reduced computer system performance. Therefore, it is desired to implement an I/O port protocol compatible with a DSM computer system memory and cache coherence protocol that permits I/O port operations to take place in the DSM computer system efficiently, quickly, and easily while maintaining the coherency of the data accessed by I/O port devices.