The present invention relates generally to distributed computer systems, and more specifically the invention pertains to a broadcast protocol system that reduces the number of messages required for effective communication between remotely located computers in a distributed computer system.
A protocol system may be defined as a set of rules governing the format and timing of message exchanges to control data movements and correct errors. It is important to ensure that the protocol is valid, makes sense, works, and is adhered to by all users of the network in question. Many distributed computer systems use a communication mechanism that is physically a broadcast medium, such as the Ethernet or a packet radio system. Other common communication media, such as the token ring, could function as broadcast media, even though they are not normally so used. The advantage of a broadcast communication medium is that it makes it physically possible to distribute a message simultaneously to several destinations.
There are important activities in a distributed computer system that involve many processors simultaneously and that would benefit from broadcast communication. Among these are scheduling and load balancing, synchronization, access to distributed information, update and commit for distributed databases, and transaction logging.
Existing communication protocols do not allow distributed computer systems to make use of this broadcast capability, but rather require all messages to be point-to-point, from a single source to a single destination. If the nature of the application is such that broadcast communication is appropriate, existing systems must send many individual messages and receive corresponding individual acknowledgements. In a network of N nodes, this results in a total of 2N messages, when perhaps a single broadcast message might have sufficed. The high cost of emulating broadcast communication with point-to-point messages is not only wasteful of the communication resource; it also limits the size of the distributed system by saturating the communication system, and it discourages the use of truly distributed algorithms because of their unnecessarily high communication cost.
Reliable transmission of a message requires the ability to retransmit the message when it is damaged or lost in transit. Within the ISO protocol hierarchy, the primary responsibility for ensuring this reliable transmission across the broadcast communication medium lies with the link-level communication protocol. This protocol is directed towards that level of the hierarchy. Consequently, the protocol provides only services appropriate to the link level, in contrast to other atomic broadcast protocols that ignore the hierarchy and are designed to be entirely self-contained. For example, our protocol can determine whether a node has acknowledged receiving a message, but has no responsibility for network membership or network reconfiguration following a failure. Some of what we describe below may also be relevant to other levels in the protocol hierarchy, particularly the transport level that ensures reliable transmission between hosts.
Most existing link-level protocols use positive acknowledgements, in which the recipient of a message explicitly transmits an acknowledgement of its receipt, either as a separate message or as part of another message. The sender of the original message uses a timeout to trigger retransmission if no acknowledgement is received from the recipient. In a broadcast context, such protocols require individual acknowledgements from each recipient, even if it is possible (which it usually is not) to take advantage of the broadcast medium to disseminate the initial message to all recipients. Thus, broadcasting with positive acknowledgements could reduce the number of messages from 2N to N+1, which is still far from taking full advantage of the broadcast medium.
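The message counts quoted above can be checked with a short calculation. The following is only an illustrative sketch: the function names are hypothetical, and it assumes that N counts the recipients of one logical broadcast and that no retransmissions occur.

```python
def point_to_point_messages(n: int) -> int:
    """N individual messages plus N individual acknowledgements (2N)."""
    return 2 * n


def broadcast_with_positive_acks(n: int) -> int:
    """One broadcast message plus N individual acknowledgements (N+1)."""
    return 1 + n


if __name__ == "__main__":
    # Assumption: error-free transmission, so no retransmissions are counted.
    for n in (4, 16, 64):
        print(n, point_to_point_messages(n), broadcast_with_positive_acks(n))
```

Even in the better case the count still grows linearly with N, because every recipient returns its own acknowledgement.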
The task of providing a new broadcast protocol system which uses a broadcast medium with a reduced number of messages is alleviated, to some extent, by the systems disclosed in the following U.S. Patents, the disclosures of which are incorporated herein by reference:
U.S. Pat. No. 3,824,547 issued to Green et al;
U.S. Pat. No. 3,876,979 issued to Winn et al;
U.S. Pat. No. 4,725,834 issued to Chang et al;
U.S. Pat. No. 4,745,593 issued to Stewart; and
U.S. Pat. No. 4,807,224 issued to Naron et al.
Chang et al describe a reliable broadcast protocol for a token passing bus network. In the patented system, if some receiver in the broadcast group does not receive the broadcast message, it can request a retransmission. The primary or token site receiver, not the source station, retransmits the broadcast message to assure a reliable broadcast system.
Green et al are concerned with a bidirectional data communication system in which positive and negative acknowledgement signals are utilized to inform a transmitting station of any errors in the reception of a message. An arrangement for testing a packet switching network is set forth in the Stewart patent.
In the system of the Winn et al patent, an error in transmission results in the sending of a retransmission request. All messages subsequent to the last sequence number correctly received before the error was detected are then retransmitted. Naron et al distribute data to an unlimited number of remote receiver installations.
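The retransmission rule attributed above to the Winn et al patent, resending everything after the last correctly received sequence number, can be sketched as follows. This is a minimal illustration rather than the patented implementation; the function name and the representation of messages as (sequence number, payload) pairs are assumptions made for the example.

```python
def messages_to_retransmit(sent, last_good_seq):
    """Select every sent message whose sequence number follows last_good_seq.

    `sent` is a list of (sequence_number, payload) pairs in transmission
    order; this representation is hypothetical, chosen for illustration.
    """
    return [(seq, payload) for seq, payload in sent if seq > last_good_seq]


if __name__ == "__main__":
    sent = [(1, "a"), (2, "b"), (3, "c"), (4, "d")]
    # The receiver reports that message 2 was the last one received correctly.
    print(messages_to_retransmit(sent, 2))
```

Under this rule, messages that arrived intact after the error are sent again as well, which keeps the receiver simple at the cost of extra traffic.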
In addition to the systems described above, the most detailed existing description of a reliable broadcast protocol is that by Chang and Maxemchuk in their article "Reliable broadcast protocols," ACM Transactions on Computer Systems 2, 3 (August, 1984), 251-273. Their protocol requires that all messages pass through an intermediary node, called the token site. A node wishing to broadcast a message must communicate it to the token site, using a positive-acknowledgement protocol. Using a negative-acknowledgement protocol, the token site then broadcasts the message to all recipients; any missing messages are detected by gaps in the sequence. The use of a single common intermediary makes the negative-acknowledgement technique more effective. A complex token passing protocol is used to detect failures at the token site, to select a new token site, and to retransmit messages affected by the failure. Although only two messages and one acknowledgement are required for each message broadcast in the absence of errors, the token passing protocol can add significantly to the number of messages if transmission errors are frequent.
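Detection of missing messages by gaps in the sequence, as used in the negative-acknowledgement phase described above, can be illustrated with a small sketch. The class and method names are hypothetical, and the sketch assumes that sequence numbers are consecutive integers assigned by a single source such as the token site; it is not the Chang and Maxemchuk protocol itself.

```python
class GapDetector:
    """Track the next expected sequence number and report any gaps."""

    def __init__(self, first_expected=1):
        self.next_expected = first_expected

    def receive(self, seq):
        """Return the sequence numbers a receiver would request again."""
        if seq < self.next_expected:
            # Duplicate or late arrival; this simplified sketch ignores it.
            return []
        missing = list(range(self.next_expected, seq))
        self.next_expected = seq + 1
        return missing


if __name__ == "__main__":
    detector = GapDetector()
    for seq in (1, 2, 5, 6):
        print(seq, detector.receive(seq))
```

A receiver stays silent while messages arrive in order and transmits a request only when a gap appears, which is why error-free operation requires no per-recipient acknowledgements.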
In view of the foregoing discussion, it is apparent that there remains a need to provide a broadcast protocol system that reduces the number of messages between remotely located computers. The present invention is intended to satisfy that need.