This invention relates generally to the field of parallel computing and more particularly to a method of providing high performance recoverable communication between the nodes in a parallel computing system.
As is known in the art, large scale parallel computers have historically been constructed with specialized processors and customized interconnects. The cost of building specialized processors, in terms of both components and time to market, has caused many computer manufacturers to re-evaluate system designs. Currently many vendors are attempting to provide performance similar to that of custom designs using standard processors and standard networks. Systems built from standard processors and networks are generally marketed and sold as clustered computer systems.
By using standard components and networks, clustered systems have the advantage of providing a parallel computing system at a much lower cost and with a shorter time to market. However, because standard network protocols are used, a communication overhead is incurred that translates into poor overall parallel system performance.
The source of much of the performance loss associated with standard networks arises because the currently existing network hardware is incapable of guaranteeing message delivery and order. Because these guarantees are not provided by network hardware, software solutions are required to detect and handle errors incurred during message transmission.
Network software typically comprises many layers of protocol. These network layers are executed by the operating system and work together to detect dropped messages and transmission errors, among other events, and to recover from them. Because the network software is part of the operating system, there is no provision for direct access to the network by a given application program. Accordingly, because there is no direct link between the application program and the network, performance is further reduced by the overhead of the network software interface.
One method for providing high performance communication was described in U.S. Pat. No. 4,991,079, entitled "Real-Time Data Processing System", by Dann et al., assigned to Encore Computer Corporation, issued on Feb. 5, 1991 (hereinafter referred to as the Encore patent).
The Encore patent describes a write-only reflective memory data link, a form of networking better suited to parallel computing than standard networks. The reflective memory system is a real-time data processing system in which each of a series of processing nodes is provided with its own data store, partitioned into a local section and a section that is shared between the nodes. The nodes are interconnected by a data link. Whenever a node writes to an address in the shared portion of the data store, the written data is communicated (i.e., "reflected") to all of the nodes via the data link. The data at each address of the shared data store can be changed only by the one node that has been designated as the master node for that address. Because each address containing shared data can be written by only one node, collisions between different nodes attempting to change a common item of data cannot occur.
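The reflective memory behavior described above may be illustrated by the following minimal sketch. It is a software model offered for explanation only and is not taken from the Encore patent; the names `Node`, `ReflectiveLink`, `write_shared` and `read_shared`, as well as the use of a simple per-address table of master nodes, are assumptions introduced here. The sketch models only the reflection of writes and the single-master rule, and it assumes a data link that never loses or reorders a write.

```python
class Node:
    """One processing node with a private section and a shared section."""

    def __init__(self, node_id, local_size, shared_size):
        self.node_id = node_id
        self.local = [0] * local_size    # visible only to this node
        self.shared = [0] * shared_size  # this node's copy of the shared section


class ReflectiveLink:
    """Hypothetical model of the data link that reflects shared writes."""

    def __init__(self, num_nodes, shared_size, masters, local_size=4):
        # masters[addr] is the id of the only node allowed to write addr.
        if len(masters) != shared_size:
            raise ValueError("one master node is required per shared address")
        self.nodes = [Node(i, local_size, shared_size) for i in range(num_nodes)]
        self.masters = list(masters)

    def write_shared(self, node_id, addr, value):
        # Enforce the single-master rule, so two nodes can never
        # collide on the same shared address.
        if self.masters[addr] != node_id:
            raise PermissionError(
                f"node {node_id} is not the master of shared address {addr}"
            )
        # Reflect the write into every node's copy of the shared section.
        # Assumption: the link is ideal and delivers every write.
        for node in self.nodes:
            node.shared[addr] = value

    def read_shared(self, node_id, addr):
        # Reads are purely local; nothing crosses the link.
        return self.nodes[node_id].shared[addr]


if __name__ == "__main__":
    link = ReflectiveLink(num_nodes=3, shared_size=4, masters=[0, 1, 2, 0])
    link.write_shared(node_id=1, addr=1, value=42)
    print([link.read_shared(n, 1) for n in range(3)])
```

In this model a read never generates link traffic, which corresponds to the "write-only" character of the data link. Because the sketch assumes that every reflected write arrives, it has no means of detecting or recovering from a lost write.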
The Encore system, although it describes a method for providing high performance parallel computing, provides no mechanism for ensuring recoverable communication. Accordingly, because there are no hardware mechanisms for providing error recovery, the support must still be provided by software. As a result, the Encore system incurs a similar communication overhead that translates into reduced parallel system performance.