1. Field of the Invention
The present invention relates generally to network communication systems and associated communication protocols, and more particularly to an error-correcting communication method for transmitting data packets in a network communication system.
2. Description of the Prior Art
In network communication systems having a plurality of communicating endpoints, i.e., devices or units connected to a bus, such as the IEEE Standard 1394 High Performance Serial Bus, or devices or units connected via transmission media other than a bus, data can be corrupted during transmission, or be sent to an endpoint which is busy because its processing speed is slower than that of the transmitting endpoint or because it is simultaneously processing transmissions from other endpoints. The provisions in prior art error-correcting communication protocols to recover from these events require significant complexity in their implementation, as well as additional processing overhead (in processor cycles) in the handling of every communicated packet (even though most of the packets are communicated without a problem). The additional burden to support the complexity and processing overhead adds to the expense of implementing, and limits the throughput of, the network communication system.
Prior art error-correcting communications protocols, such as Link Access Procedure on the D-channel (LAPD), typically require the receiving endpoint to separately acknowledge each received packet. Because the transmitted packet may be lost (e.g., corrupted) during transmission, the transmitting endpoint is required to maintain a retransmission timer for previously transmitted but unacknowledged packets on each supported logical link. The expiration of the timer stimulates special recovery procedures, such as querying the receiver for the last good packet received, and ultimately results in retransmitting the corrupted packet(s). The retransmission intervals are usually on the order of hundreds of milliseconds, because processing latencies at the receiver, and perhaps also transmission latencies, may legitimately delay receiver acknowledgments. Because of the significant delay which may be incurred in receiving an acknowledgment to a transmitted packet, “window mechanisms” are employed to enable the transmitter to have some number of outstanding transmitted packets which have not yet been acknowledged.
“Busy conditions” at the receiver are typically handled by explicit notification to a transmitting endpoint when the busy condition exists and when it is exited (which itself adds to the processing load on the busy receiver). This is necessary because using the normal retransmission intervals to recover from busy conditions would result in unacceptable delays in getting new packets to the receiver. Both the error recovery and busy recovery procedures described above must work in the presence of errors themselves (e.g., a receiver acknowledgment or indication that a busy condition has cleared could become corrupted).
Further, communications to recover from errors or busy conditions in one direction of communication between two endpoints may be hindered by busy or overloaded conditions in the other direction of communication between those same two endpoints. Due to the complexity of implementing these protocols, and because of the need to maintain the per-link state information (e.g., retransmission timers and transmit windows) for a large number of links, these protocols are typically implemented in software. In a typical multi-tasking, pre-emptive operating system, the protocol processing for each error-free packet may involve two or more context switches, as well as the work to start and stop a software timer. When it is desired to handle large volumes of traffic, the processor overhead of these operating system actions alone can be significant. The software development of these complex protocols is also error prone, and the development cost is significant. It is not unusual for a product including such a protocol to suffer one or more hard-to-find “bugs” in the communication protocol after the product has been deployed.
Accordingly, there exists a need for an error-correcting communications protocol which improves communication system performance and efficiency in the transmission of packets between independent, inter-communicating endpoints. Further, there exists a need for a simpler error-correcting protocol, and one that can be cheaply implemented in hardware, enabling high performance yet cost-effective network communication systems.
The present invention provides an error-correcting communication protocol implemented within a network communication system connecting a plurality of endpoints, i.e., devices or units, e.g., servers and terminals, via at least one transmission medium. In the protocol, each transmitting endpoint separates its traffic, e.g., data packets, to be transmitted into distinct queues. The packet at the head of one queue is transmitted, and no other packet is transmitted until a transmit complete signal is generated. A transmission is termed “incomplete” until the transmit complete signal is serviced. There is only one incomplete transmission from a given transmitting endpoint at any time.
When a packet is received without error and accepted by the receiving or destination endpoint, the receiving endpoint immediately returns an acknowledgment indicating successful reception to the transmitting endpoint. The transmit complete signal is generated at the transmitting endpoint at a time when any receiver acknowledgment should have been received by the transmitting endpoint.
When the transmit complete signal is processed by the transmitting endpoint (in hardware or in a software transmit complete interrupt service routine), if a receiver acknowledgment indicating successful reception has been received, the packet previously sent is removed from the head of its queue. Otherwise, the packet is left at the head of its queue, the queue is placed in a “pending retry” state, and a short “pending retry” timer, preferably a hardware timer, is started (unless the pending retry timer has already been started because of a pending retry for another queue). In either event, another packet is transmitted from the head of a queue not in the pending retry state, if there is any.
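The transmit complete handling described above can be illustrated with a minimal Python sketch. It is not the claimed implementation: the class name `Transmitter`, its method names, the lowest-index-first queue selection, and the boolean that stands in for the pending retry timer are all assumptions made for illustration only.

```python
from collections import deque


class Transmitter:
    """Toy model of one transmitting endpoint (all names are hypothetical)."""

    def __init__(self, num_queues):
        self.queues = [deque() for _ in range(num_queues)]
        self.pending_retry = [False] * num_queues
        # Stand-in for the short pending retry timer (assumed, not a real timer).
        self.retry_timer_running = False
        # Index of the queue that owns the single incomplete transmission.
        self.in_flight = None

    def enqueue(self, queue_id, packet):
        self.queues[queue_id].append(packet)

    def send_next(self):
        """Transmit the head packet of a queue not pending retry, if any."""
        if self.in_flight is not None:
            return None  # only one incomplete transmission at any time
        # Assumption: queues are scanned in index order; the text does not
        # prescribe a selection policy.
        for queue_id, queue in enumerate(self.queues):
            if queue and not self.pending_retry[queue_id]:
                self.in_flight = queue_id
                return queue_id, queue[0]  # packet stays queued until acknowledged
        return None

    def on_transmit_complete(self, ack_received):
        """Service the transmit complete signal, then try to transmit again."""
        queue_id = self.in_flight
        self.in_flight = None
        if queue_id is not None:
            if ack_received:
                self.queues[queue_id].popleft()
            else:
                self.pending_retry[queue_id] = True
                # Start the timer only if it is not already running for
                # another queue.
                self.retry_timer_running = True
        return self.send_next()
```

In this sketch an unacknowledged packet blocks only its own queue, so traffic from the other queues continues while the retry is pending.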
When the pending retry timer expires, a hardware process or a software transmit complete interrupt service routine moves all queues associated with the pending retry timer out of the pending retry state, enabling their packets to be transmitted again. Accordingly, the first packet to be transmitted from each of those queues will be a retransmission. An incrementing sequence number consisting of at least one bit is used for all transmissions from the same queue, and checked by the receiving endpoint, so that duplicate packets, e.g., packets which were erroneously retransmitted because of lost receiver acknowledgments, may be discarded by the receiving endpoint.
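The timer expiry and duplicate check can be sketched in the same hedged way. The names `on_retry_timer_expired` and `Receiver`, the one-bit sequence number starting at 0, and the choice to acknowledge a discarded duplicate (so that a transmitter whose acknowledgment was lost can advance) are assumptions for illustration and are not stated in the text above.

```python
def on_retry_timer_expired(pending_retry):
    """Move every queue out of the pending retry state; return their indices."""
    released = [q for q, pending in enumerate(pending_retry) if pending]
    for q in released:
        pending_retry[q] = False
    return released


class Receiver:
    """Toy receiving endpoint using a one-bit sequence number per sender queue."""

    def __init__(self):
        # (sender, queue_id) -> next expected sequence bit; assumed to start at 0.
        self.expected = {}
        self.delivered = []

    def on_packet(self, sender, queue_id, seq_bit, payload):
        """Accept a new packet or discard a duplicate; return True as the ack."""
        key = (sender, queue_id)
        if seq_bit == self.expected.get(key, 0):
            self.delivered.append(payload)
            self.expected[key] = seq_bit ^ 1  # a one-bit counter wraps by toggling
        # Assumption: a duplicate is discarded but still acknowledged.
        return True
```

For example, if the acknowledgment for a packet with sequence bit 0 is lost, the retransmission arrives with the same bit, fails the comparison against the expected bit 1, and is dropped without being delivered twice.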