Packet-switching offers the promise of greater efficiency in the utilization of a common communications network of lines and switches by dividing user messages (or data blocks) into short, self-addressed packets, transporting the packets over the network, and re-assembling them at each of their various destinations. But, despite great advances in computing and telecommunications in recent years, end-to-end (device-to-device) communication rates across packet-switched networks have improved little.
The primary reason for this impasse is the software (and thus processor) burden imposed on node processors by the programmed, interrupt-mode input/output structures involved in packet handling. The functions involved include: packeting the message and appending the correct control and address information; putting the packet onto each successive link en route to its destination without collision with other packets (or, if there is a collision, recovering from it); checking and re-routing it at each switch; checking it for validity and sequence, and re-ordering it if necessary (at least at the destination if not at every switch); acknowledging each correct packet and requesting that missing, corrupted or out-of-sequence packets be re-transmitted; re-transmitting missing or corrupted packets and fitting them into place; and depacketing the data to re-create the original message. Not only does this slow end-to-end communication, but it also means that node processor capacity is not available to provide advanced user-oriented facilities such as protocol conversion for different devices, multi-session windowing, encryption and the provision of enhancements to `dumb` computer terminals and digital phones.
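The first and last of the functions listed above (packeting and depacketing) can be illustrated with a minimal Python sketch. The `Packet` fields, the 4-byte payload size and the function names are assumptions made for illustration only; they are not taken from any of the systems discussed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    # Hypothetical header layout, chosen only for this illustration.
    src: int
    dst: int
    seq: int        # position of this packet within the message
    last: bool      # set on the final packet of the message
    payload: bytes


def packetize(message, src, dst, size=4):
    """Split a message into short, self-addressed packets."""
    chunks = [message[i:i + size] for i in range(0, len(message), size)] or [b""]
    return [
        Packet(src, dst, seq, seq == len(chunks) - 1, chunk)
        for seq, chunk in enumerate(chunks)
    ]


def depacketize(packets):
    """Re-order packets by sequence number and rebuild the message.

    Raises ValueError if any packet is missing or duplicated.
    """
    ordered = sorted(packets, key=lambda p: p.seq)
    complete = (
        bool(ordered)
        and ordered[-1].last
        and [p.seq for p in ordered] == list(range(len(ordered)))
    )
    if not complete:
        raise ValueError("missing or duplicated packets")
    return b"".join(p.payload for p in ordered)


packets = packetize(b"hello world", src=1, dst=2)
message = depacketize(reversed(packets))  # arrival order does not matter
```

Even in this toy form, every packet requires header construction on the way out and a sort and completeness check on the way in, which is the per-packet work that falls on the node processor.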
The problem is compounded if any link en route, or the destination node, is temporarily congested, so that packets must be discarded or queued for later transmission. Queuing adds the burden of queue management to that of packet management for nodal processors. Voice data is inherently bursty and intolerant of delay, but can accommodate the random loss of a small percentage of packets. Computer data is tolerant of delay, but the loss of even one packet may involve the re-transmission of many others, thereby exacerbating any congestion problem. Other data, such as real-time control information, may well be unable to tolerate either appreciable delay or packet loss. Thus, packets cannot be discarded indiscriminately in integrated systems.
Different strategies for addressing these problems in computer communications exist. Very large packets can be permitted (as in SNA) to reduce the cumulative effect of per-packet processing, but insofar as such packets are used, the advantages of packet-switching are lost. The burden of error checking at the destination and end-to-end re-transmission can be reduced by the use of point-to-point/store-and-forward protocols (as in ARPANET and SNA), but overall processor involvement is greatly increased thereby, particularly if all packets of a block or message are re-sequenced at each intermediate point. Alternatively, the end-point processor can be made to do almost all the work in a simple end-to-end datagram service (as in DECNET) in which packets need not be delivered in sequence, may be discarded (e.g. for congestion control), may be duplicated or may loop within the network. None of these approaches is suited to the handling of voice packets and none, with the exception of SNA, is concerned with Session Layer communication. All allow long and variable-length packets and therefore suffer high latency.
As TDM is commonly used for the multiplexing of digital voice channels, and as it offers shorter packet delays, collision-free access and the preservation of sequence, its application to integrated wide-area packet-switched networks has been proposed, despite the attendant equipment costs. However, as shown by the comprehensive TDM system disclosed in U.S. Pat. No. 3,749,845 to Bell Laboratories, very substantial processor burdens at nodes and switches are involved, even though end-to-end Transport and Session Layer protocols were not addressed. U.S. Pat. No. 3,979,733 to the same assignee sought to reduce this apparently impractical processing burden by the use of a hardware-implementable technique for buffering and re-addressing packets as they are taken off one TDM trunk and put onto another. But that addressed only a relatively minor part of the problem. With similar effect, U.S. Pat. No. 4,491,945 to the same assignee disclosed hardware-based Banyan-type packet switches and a scheme for rotating address bits as packets transit the switches.
Sequencing of packets, particularly computer data packets, is necessary in packet-switched systems where successive packets may be routed differently or variably buffered. Various protocols employed to properly sequence packets in computer communications are reviewed in the Tanenbaum Reference. A short sequence number field in the packet is used together with a `sliding-window` at the receiving node to identify the next packet expected. In the simplest protocol, any out-of-sequence packet and all subsequent packets in a block (or file) transfer are re-transmitted, leading to significant delays and lost bandwidth. In a more complex protocol, a buffer is set aside for each packet of a block transfer at the destination node and each packet is placed into the appropriate buffer as it arrives. As long as there are no earlier gaps, the node processor can either commence to assemble the packets into a contiguous block for transfer to the appropriate session, or read the packets from the buffers in correct sequence to a host processor. If there are gaps, the missing packets can be identified for re-transmission. The latter method is impractical where packets differ widely in length, the block size is large, or blocks may comprise large numbers of packets. Moreover, it requires more buffer space, double-handling of packets and an even greater demand on software processes at the destination.
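The more complex, buffer-per-packet protocol described above can be sketched as follows. This is a simplified illustration, not the method of any cited reference: the class name `ReassemblyBuffer` is hypothetical, and absolute packet indices are assumed in place of the short, wrapping sequence number field used with a real sliding window.

```python
class ReassemblyBuffer:
    """One buffer slot per packet of a block transfer (illustrative only)."""

    def __init__(self, total_packets):
        self.slots = [None] * total_packets
        self.delivered = 0  # count of packets already passed on in sequence

    def receive(self, seq, payload):
        """Store a packet; return the payloads now deliverable in order."""
        if not 0 <= seq < len(self.slots):
            raise ValueError("sequence number outside the block")
        if self.slots[seq] is None:  # ignore duplicates
            self.slots[seq] = payload
        ready = []
        while (self.delivered < len(self.slots)
               and self.slots[self.delivered] is not None):
            ready.append(self.slots[self.delivered])
            self.delivered += 1
        return ready

    def missing(self):
        """Sequence numbers of gaps below the highest packet received."""
        received = [i for i, s in enumerate(self.slots) if s is not None]
        highest = max(received, default=-1)
        return [i for i in range(highest) if self.slots[i] is None]


buf = ReassemblyBuffer(total_packets=4)
first = buf.receive(0, b"a")    # in sequence: delivered at once
second = buf.receive(2, b"c")   # out of sequence: held in its slot
gaps = buf.missing()            # packet 1 must be re-transmitted
third = buf.receive(1, b"b")    # gap filled: packets 1 and 2 released
```

The sketch makes the costs noted above visible: a slot must be reserved for every packet of the block, and each held packet is handled twice, once on arrival and once on release.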
In all prior packet-switching systems for handling computer data known to the applicant, the destination node or host processor is interrupted every time a correct packet is received in sequence in order to determine the length of its data segment, to allocate the correct memory location for the data and to transfer the data to that location, so that the original message will finally be assembled. While the use of hardware-mediated direct memory access (DMA) techniques for transferring data to and from memory locations substantially without processor involvement is well known in computer design (see the Shiva Reference), similar techniques have not been used or proposed (to the applicant's knowledge) for the transfer of data from memory to memory across a network. The necessary bus, data, address and control lines for DMA are not available in a network, and the problem of out-of-sequence bytes does not arise in DMA transfers within computers. Nevertheless, if similar hardware-based techniques could be used for data transfer across a network, substantial advantage would be gained.
Asynchronous, collision-free network access under fully distributed control is recognised as being essential for wide-area packet-switched networks, and known ring-based systems (such as the register insertion loops and rings, slotted rings and token-passing rings reviewed in the Tropper and Tanenbaum References) offer these features. In general, however, ring-based systems are regarded as being inherently unsuited for use in WANs because of their ring structure. Nevertheless, register insertion loops are of particular interest because they provide some degree of inherent packet storage or buffering and can have packets in transit at the same time between different pairs of nodes on the ring.
Register, buffer or delay insertion secures distributed and contention-free media access by the simple expedient of delaying any incoming packet in a register or buffer (herein called the hold FIFO (first-in, first-out) register) while an outgoing packet is being placed on the loop. The Tropper and Tanenbaum References review non-contention loop systems, including register insertion, and note the various ways in which the inherent latency in the transfer of data around a register insertion loop may be minimised. That problem is addressed in greater depth in the other References.
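The register insertion principle described above can be modelled at packet granularity with a short Python sketch. The class `InsertionNode`, its `capacity` parameter and the one-packet-per-tick timing are assumptions made for illustration; real implementations operate on bits or words and differ in detail.

```python
from collections import deque


class InsertionNode:
    """Toy model of one register insertion node (illustrative only)."""

    def __init__(self, capacity):
        self.capacity = capacity   # packets the hold FIFO can absorb
        self.hold = deque()        # hold FIFO for delayed transit packets
        self.outbox = deque()      # locally generated packets awaiting access

    def queue_local(self, packet):
        self.outbox.append(packet)

    def tick(self, incoming=None):
        """Advance one packet time; return the packet put on the loop."""
        if self.outbox and len(self.hold) < self.capacity:
            # Room to delay an arrival, so the local packet may be inserted.
            out = self.outbox.popleft()
            if incoming is not None:
                self.hold.append(incoming)  # delayed, never collided with
            return out
        if incoming is not None:
            self.hold.append(incoming)
        # Otherwise relay transit traffic; the FIFO drains during gaps.
        return self.hold.popleft() if self.hold else None


node = InsertionNode(capacity=1)
node.queue_local("L1")
node.queue_local("L2")
sent = [node.tick(p) for p in ("A", "B", None, None, None)]
```

In this run the node inserts `L1` while `A` is held, relays `A` and `B` in their original order, and can insert `L2` only once a gap in the incoming stream has let the hold FIFO drain. The added transit delay is the latency referred to above.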
Register insertion switches intended for substantial implementation in hardware and for voice/data packet communications in LANs are disclosed in U.S. Pat. No. 4,500,987 to NEC and No. 4,168,400 to CETT. Each assigns a higher priority to voice packets (identified in a type field) and allocates each type of packet to a separate first-in-first-out (FIFO) queue at each switch or node. Control logic is used to select the highest priority packets for transmission in gaps between packets on the asynchronous loop, or in place of lower priority packets on the loop. The switching nodes are distributed geographically around the media of a serial loop.
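The selection step described above (one FIFO queue per packet type, with the highest priority non-empty queue served first) can be sketched as follows. The constants `VOICE` and `DATA` and the class `PriorityOutput` are hypothetical names; the sketch covers only the filling of gaps and does not model the replacement of lower priority packets already on the loop.

```python
from collections import deque

# Assumed type codes: a lower value means a higher priority.
VOICE, DATA = 0, 1


class PriorityOutput:
    """Per-type FIFO queues with strict priority selection (illustrative)."""

    def __init__(self):
        self.queues = {VOICE: deque(), DATA: deque()}

    def enqueue(self, ptype, packet):
        self.queues[ptype].append(packet)

    def next_for_gap(self):
        """Return the packet to send in the next gap, or None if idle."""
        for ptype in sorted(self.queues):
            if self.queues[ptype]:
                return self.queues[ptype].popleft()
        return None


out = PriorityOutput()
out.enqueue(DATA, "d1")
out.enqueue(VOICE, "v1")
out.enqueue(DATA, "d2")
out.enqueue(VOICE, "v2")
order = [out.next_for_gap() for _ in range(5)]
```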
The NEC patent effectively allocates bandwidth on demand for voice connections by reserving a circulating packet `space` to effect duplex communication. But this carries the penalty that many of the reserved packets will be empty, thereby forgoing a major advantage of packet switching in integrated systems (the ability to fill voice gaps with data packets). To maintain loop synchronization, fixed-length packets are used and the loop transmission delay is dynamically adjusted to an integer multiple of the packet length. The CETT patent argues advantage in being able to use the more efficient variable-length packets and discloses a method for inserting them on a loop in place of corrupted packets. Neither patent discloses methods suited to the end-to-end handling of data packets at the Network, Transport or Session Layers.
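The adjustment of loop delay to an integer multiple of the packet length is simple modular arithmetic, sketched below. The function name and the figures (a 1000-bit natural loop delay and 256-bit packets) are illustrative assumptions and are not taken from the NEC patent.

```python
def padding_bits(loop_delay_bits, packet_bits):
    """Extra delay needed so the loop holds a whole number of packets."""
    if packet_bits <= 0:
        raise ValueError("packet length must be positive")
    return (-loop_delay_bits) % packet_bits


# Assumed figures for illustration only.
pad = padding_bits(1000, 256)
slots = (1000 + pad) // 256   # whole packet spaces circulating on the loop
```

With these figures, 24 bits of added delay bring the loop to 1024 bits, or exactly four packet spaces.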
The simple acknowledgement protocol used in such ring-based systems (see the Bridges Reference for an example) is that the destination copies each packet which it can receive and sets an acknowledgement (ACK) flag in the original packet, which then continues around the ring to the source, where it is removed. If the destination node is busy, it cannot copy the packet and does not set the ACK flag. The source then has the option of removing the packet and retrying later, or of allowing the packet to circulate a few more times before removing it. This protocol not only ties up the source when a destination node is busy, but also makes it impractical to handle broadcasts, especially where a few of the addressed nodes are busy. Furthermore, it is impractical in multi-loop systems (necessary in WANs), and it is largely for this reason that such loop-based systems tend to be confined to single-loop LANs.
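The ACK-flag protocol described above can be traced with a small simulation. The names `RingPacket` and `circulate`, and the modelling of a busy destination as a count of laps (`busy_laps`) during which it cannot copy the packet, are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class RingPacket:
    src: int
    dst: int
    payload: str
    ack: bool = False   # set by the destination when it copies the packet


def circulate(packet, num_nodes, busy_laps, max_laps=3):
    """Send a packet around the ring until acknowledged or given up.

    Returns (acknowledged, laps_used). The source removes the packet when
    it returns with ACK set, or after max_laps laps without one.
    """
    for lap in range(1, max_laps + 1):
        node = (packet.src + 1) % num_nodes
        while node != packet.src:
            destination_free = lap > busy_laps
            if node == packet.dst and destination_free and not packet.ack:
                packet.ack = True   # destination copies the packet
            node = (node + 1) % num_nodes
        if packet.ack:
            return True, lap
    return False, max_laps


idle = circulate(RingPacket(0, 2, "x"), num_nodes=4, busy_laps=0)
busy = circulate(RingPacket(0, 2, "x"), num_nodes=4, busy_laps=2)
lost = circulate(RingPacket(0, 2, "x"), num_nodes=4, busy_laps=5)
```

The second and third cases show the drawback noted above: while the destination is busy, the packet keeps occupying the ring and the source remains tied up waiting for its return.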
It should be noted that the terms `loop` and `ring` are used synonymously in this specification, though `loop` is often used for systems in which the whole packet is received before relay and `ring` is often used for systems in which the bits of a packet stream through each station. It should also be noted that reference to a loop in this context does not exclude a dual loop, one for each direction, shared by all switch elements.