This invention relates generally to the efficient design and operation of communication networks and, more particularly, to scheduling mechanisms for use in stations connected to a communication network. A network consists of a communication channel, and a number of stations connected to the channel. Information is transmitted over the channel in data frames or packets, the packets having headers that permit each station to determine the destination of the data.
Each station has to perform a number of tasks and subtasks related to sending and receiving data packets. In general, these tasks and subtasks may overlap in time, and the performance of any one of them may be dependent on the completion of another. Moreover, the functions performed in a station, whether dependent or independent, typically have relative priorities. Since a station may comprise only one processor for performing all of the station tasks and subtasks, scheduling these tasks and subtasks at the station is a critical aspect of network operations. Scheduling in the end-system has to meet the requirements expected by the network: the network protocols, the rules for accessing the channel, and so forth, primarily in terms of throughput and latency.
As computer networks have developed, various approaches have been used in the choice of communication medium, network topology, message format, and protocols for channel access. Some of these approaches have emerged as standards, and a model for network architectures has been proposed and widely accepted. It is known as the International Organization for Standardization (ISO) Open Systems Interconnection (OSI) reference model. The OSI reference model is not itself a network architecture. Rather, it specifies a hierarchy of protocol layers and defines the function of each layer in the network. Each layer in one computer of the network carries on a conversation with the corresponding layer in another computer with which communication is taking place, in accordance with a protocol defining the rules of this communication. In reality, information is transferred down from layer to layer in one computer, then through the channel medium and back up the successive layers of the other computer. However, for purposes of design of the various layers and understanding their functions, it is easier to consider each of the layers as communicating with its counterpart at the same level on the remote machine. Of interest in the present invention, by contrast, are the data flowing from one layer to the next on the same machine, and the scheduling of the activities of each of these layers in the same machine.
The lowest layer defined by the OSI model is called the physical layer, and is concerned with transmitting raw data bits over the communication channel, and making sure that the data bits are received without error. Design of the physical layer involves issues of electrical, mechanical or optical engineering, depending on the medium used for the communication channel. The layer next to the physical layer is called the data link layer. The main task of the data link layer is to transform the physical layer, which interfaces directly with the channel medium, into a communication link that provides communication services to the next layer above, known as the network layer. The data link layer performs such functions as structuring data into packets or frames, and attaching control information to the packets or frames, such as checksums for error detection, and packet numbers.
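By way of illustration only, the framing function described above can be sketched as follows. The header layout (a 4-byte packet number and a 2-byte payload length), the use of CRC-32 as the checksum, and the names build_frame and parse_frame are assumptions made for this sketch; they are not taken from any particular data link protocol.

```python
import struct
import zlib

# Assumed layout: packet number (4 bytes) and payload length (2 bytes),
# followed by the payload and a 4-byte CRC-32 trailer, all big-endian.
HEADER = struct.Struct("!IH")
TRAILER = struct.Struct("!I")


def build_frame(seq, payload):
    """Wrap a payload in a frame carrying a packet number and a checksum."""
    header = HEADER.pack(seq, len(payload))
    checksum = zlib.crc32(header + payload)
    return header + payload + TRAILER.pack(checksum)


def parse_frame(frame):
    """Return (seq, payload); raise ValueError if the frame is damaged."""
    if len(frame) < HEADER.size + TRAILER.size:
        raise ValueError("frame too short")
    seq, length = HEADER.unpack_from(frame, 0)
    body_end = HEADER.size + length
    if body_end + TRAILER.size != len(frame):
        raise ValueError("length field does not match frame size")
    (expected,) = TRAILER.unpack_from(frame, body_end)
    if zlib.crc32(frame[:body_end]) != expected:
        raise ValueError("checksum mismatch")
    return seq, frame[HEADER.size:body_end]
```

A receiver in this sketch rejects any frame whose recomputed checksum differs from the trailer, which is the error-detection role the text assigns to the control information.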
Although the data link layer is largely independent of the nature of the physical transmission medium, certain aspects of the data link layer function are more dependent on the transmission medium. For this reason, the data link layer in some network architectures is divided into two sublayers: a logical link control sublayer, which performs all medium-independent functions of the data link layer, and a media access control (MAC) sublayer. The MAC sublayer determines which station should get access to the communication channel when there is competition for it. The functions of the MAC sublayer are more likely to be dependent on the nature of the transmission medium.
From this background, it will be appreciated that each station must perform a number of housekeeping tasks on a continuing basis to ensure efficient operation of the station and the network. Packets of data received from the network have to be promptly processed to avoid or minimize loss of received packets. Processing includes not only delivering the packets to a station memory, but also transferring the packets from the lowest protocol layer through various higher layers. Packets of data for transmission onto the network have to be processed also, i.e., transferred through the protocol layers and eventually transmitted onto the network. Processing of data packets may also include examining and modifying packet headers. In addition, there may be a number of other types of functions that the station has to perform, depending on the activities for which the station is designed. For example, a station may be a file server for the network, handling mass storage devices to which other stations have access through the file server. Or the station may be a terminal or workstation through which a user accesses the network. In summary, each station has to process receive packets and transmit packets, and may have to perform a number of other tasks.
Some of the station functions are typically performed in a station adapter, a device that provides an interface between the network communication channel and a host system bus of the station. The adapter contains a packet buffer memory in which receive packets are temporarily stored before transfer to a host system memory, also connected to the bus, and in which transmit packets are temporarily stored after retrieval from the host system memory and before being transmitted onto the network. The adapter may have its own processor, or may use a hardware state machine to handle some of its functions. Other station functions may be performed by the host processor itself. Regardless of where in the station the various processing functions are performed, there is almost always a problem of scheduling these functions in an appropriately efficient way. The problem is simply a matter of how to allocate processing resources fairly. If a station were to be provided with abundant processing resources, there would be no significant scheduling problem. Normally, however, there is a cost constraint that keeps processing resources small, and the scheduling problem usually reduces to a question of allocating a single processor to handle multiple tasks as efficiently as possible.
Typically, scheduling mechanisms in a network station processor or adapter processor are interrupt driven, which simply means that the processor is interrupted from time to time by events as they transpire. For example, when a receive packet must be read from the network communication channel, a hardware interrupt is generated and preempts whatever is currently taking place, in favor of the usually more important receive packet processing. An interrupted processor must save the context of its current processing task, determine the cause of the interrupt, switch to a new task, and later return to the interrupted task after restoring the context at the time of the interrupt. Scheduling is often more complex in a host processor, which, in addition to communication functions, has to attend to such activities as disk input/output, file service, and application protocol processing.
An interrupt driven scheduling mechanism has assigned interrupt priorities, which are usually fixed. In most systems, the highest priority is accorded to receiving packets at the lowest network protocol layer. Then, the network protocol layers at the next higher levels get priority to process the received packet. When these tasks give up control, the lowest layers of packet transmission activity get priority. Some of these packet reception and transmission tasks may be non-interruptible, which is to say they must run to some point of completion or partial completion before another task obtains control. Often, they run to completion, which involves processing multiple packets received in a burst.
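The fixed priority ordering described above can be illustrated with a small simulation. This is a minimal sketch, assuming a single processor on which every task is non-interruptible and runs to completion; the task names (rx_low, rx_high, tx_low), the PRIORITY table, and the run function are hypothetical and serve only to show the ordering.

```python
import heapq

# Assumed fixed priorities; a lower number means a higher priority.
PRIORITY = {"rx_low": 0, "rx_high": 1, "tx_low": 2}


def run(events):
    """Dispatch tasks on one processor, each running to completion.

    events is a list of (arrival_time, task_name, duration) tuples.
    Returns a list of (start_time, task_name) in execution order.
    """
    pending = sorted(events)
    ready = []
    order = []
    clock = 0
    i = 0
    while i < len(pending) or ready:
        # Admit every task that has arrived by the current time.
        while i < len(pending) and pending[i][0] <= clock:
            arrival, name, duration = pending[i]
            heapq.heappush(ready, (PRIORITY[name], arrival, name, duration))
            i += 1
        if not ready:
            clock = pending[i][0]  # idle until the next arrival
            continue
        _, _, name, duration = heapq.heappop(ready)
        order.append((clock, name))
        clock += duration  # the task cannot be interrupted
    return order
```

With the events [(0, "tx_low", 3), (1, "rx_low", 2), (1, "rx_high", 2), (2, "tx_low", 1)], the transmit task that started at time 0 finishes first, after which both receive tasks run ahead of the second transmit task, which does not start until time 7.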
There are four related problems that should be avoided or minimized in any network station scheduling mechanism: receive livelock, transmit starvation, unacceptable packet loss, and unacceptable latency for delivery of a first packet in a long burst of received packets. Receive livelock is a condition that arises when a system is so preoccupied with fielding receive interrupts that it is unable to complete the process to deliver the packets to the application, which is the final recipient. Stated another way, when the processing and delivery of receive packets takes longer than the time between receipt of successive packets, then the processor can only complete processing packets at lower network layers, and does not make sufficient progress at the higher layers. Receive livelock usually results in the next-mentioned problem, transmit starvation. A typical scheduling approach gives receive processing higher priority, allows receives to interrupt other activity, and allows receive processing to run to completion before starting another task. In a system employing this approach, progress in processing transmit packets is assured only when the packet reception process is complete. Thus, in a receive livelock condition, or even when the processor is only relatively busy processing receive packets, the tasks relating to transmits, and also to higher layer processing, will be starved of processing resources.
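Receive livelock and transmit starvation can be reproduced in a simple time-stepped model. The following is a sketch under stated assumptions: the processor has a fixed budget of work units per tick, strict priority is given first to lowest-layer receive processing, then to higher-layer processing, then to transmission, and the per-packet costs are arbitrary illustrative values. The function name simulate and all of its parameters are hypothetical.

```python
def simulate(arrivals_per_tick, ticks, budget=10,
             cost_low=2, cost_high=3, cost_tx=2, tx_backlog=100):
    """Return (delivered, sent) after the given number of ticks."""
    low_q = 0       # packets awaiting lowest-layer receive processing
    high_q = 0      # packets awaiting higher-layer processing
    delivered = 0   # packets handed to the application
    sent = 0        # packets transmitted
    for _ in range(ticks):
        low_q += arrivals_per_tick
        cpu = budget
        # Highest priority: lowest-layer receive processing.
        n = min(low_q, cpu // cost_low)
        low_q -= n
        high_q += n
        cpu -= n * cost_low
        # Next priority: higher-layer processing and delivery.
        n = min(high_q, cpu // cost_high)
        high_q -= n
        delivered += n
        cpu -= n * cost_high
        # Lowest priority: transmit processing.
        n = min(tx_backlog - sent, cpu // cost_tx)
        sent += n
        cpu -= n * cost_tx
    return delivered, sent
```

Under these assumptions, simulate(1, 10) returns (10, 20): every received packet is delivered and transmission proceeds. With simulate(5, 10), lowest-layer receive processing consumes the entire budget in every tick and the result is (0, 0): no packet reaches the application and nothing is transmitted.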
Interrupt driven, non-preemptable processing of receives at the lowest network layers also adds latency to delivery of packets to their ultimate destination, and can introduce excessive packet loss. Latency to deliver the packet to the application in interrupt driven systems arises because of the processing overhead needed to service an interrupt and to save the context of processing already in progress. In addition, because of the priority given to this interrupt, the higher layers do not get a fair share of processing resources and so do not make timely progress, and latency to deliver the packet further increases. Packet loss results when receive livelock has advanced to the state that no progress is being made in processing packets already received, and newly arriving packets have to be dropped for lack of buffers to queue these packets to the next higher layer.
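The packet loss mechanism described above, in which arriving packets are dropped for lack of buffers, can be sketched with a bounded queue between the lowest layer and the next higher layer. The function receive_burst and its parameters are hypothetical; the sketch assumes the higher layer removes one packet from the queue after every drain_every arrivals, with 0 meaning that the higher layer makes no progress at all.

```python
from collections import deque


def receive_burst(burst_size, buffer_slots, drain_every):
    """Return (dropped, still_queued) after a burst of arrivals."""
    queue = deque()
    dropped = 0
    for n in range(1, burst_size + 1):
        if len(queue) < buffer_slots:
            queue.append(n)       # a buffer is free, so queue the packet
        else:
            dropped += 1          # no buffer is free, so drop the packet
        # The higher layer makes progress only on some arrivals, if ever.
        if drain_every and n % drain_every == 0 and queue:
            queue.popleft()
    return dropped, len(queue)
```

In this sketch, receive_burst(10, 4, 0) returns (6, 4): with the higher layer stalled, four packets fill the buffers and the remaining six are dropped. With receive_burst(10, 4, 1), where the higher layer keeps pace with arrivals, the result is (0, 0).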
It will be appreciated from the foregoing that there is still a significant need for improvement in scheduling mechanisms for communication functions of a network station operating system. Scheduling mechanisms are needed both in the host system processing unit and in the processor of the adapter that connects the host to the network. In particular, what is needed is a scheduling approach that addresses all of the problems discussed above. The present invention satisfies this need.