1. Field of the Invention
This invention relates to servicing queues of communication requests in computing systems. For example, system area network systems may support hundreds or thousands of I/O channels which share a network fabric or other communication link. This invention relates to how the bandwidth of the fabric or link may be shared amongst the communication channels.
2. Description of the Related Art
Computer system I/O performance has become critical in today's applications. Applications such as web servers, internet-based e-commerce, online transaction processing, mail and messaging, and other internet, intranet and extranet applications are extremely I/O intensive. A computer's processor may need to frequently communicate with storage systems, network cards, and even other processors over limited I/O connections. The I/O portion of computer systems is rapidly becoming a bottleneck on performance. Recently, new I/O architectures have been proposed to address the I/O bottleneck problem. Some of these proposals include system area network solutions employing numerous I/O channels sharing a switched interconnect fabric. Some recent I/O architecture proposals include Future I/O and NGIO, which have merged into System I/O, which is now controlled by the InfiniBand Trade Association.
A common theme in modern I/O architectures is the need to service many communication requests to many different communication channels that are sharing a limited total bandwidth. One problem is how to divide or allocate the available bandwidth amongst the various communication requests. Turning now to FIG. 1, a computing system is shown in which numerous communication channels are supported. A typical computing system may include multiple CPU's 102 coupled to a host bus 103. A system memory 104 and memory controller 106 may also be coupled to host bus 103 and shared by CPU's 102. Alternatively, or additionally, each CPU 102 may have its own local memory and memory controller. A host channel adapter 108 couples the host bus elements to various I/O resources. Host channel adapter 108 may be integrated in a single component with memory controller 106. A switch 110 may be employed to select amongst various I/O channels to connect to I/O devices 112. Switch 110 may include multiple switches arranged in parallel or in a hierarchical manner. Each I/O device or subsystem 112 may include a target channel adapter for interfacing to the I/O fabric 114.
The host channel adapter 108 serves to decouple CPU's 102 from I/O communications. Various applications executing on CPU's 102 may make I/O or communication requests to host channel adapter 108 for various I/O resources. Host channel adapter 108 services communication requests and returns the results to the requestor. Requests to the same channel may be placed in a queue awaiting service. Host channel adapter 108 services the communication requests within the constraints of the limited bandwidth available through switch 110 and I/O fabric 114. FIG. 1 is merely an illustration of one type of system employing multiple communication channels sharing a limited bandwidth. The problem of providing service for numerous communication channels can be found in many different architectures and computing applications.
Turning now to FIG. 2, a more conceptual illustration of queuing communication requests for different communication channels is provided. A system may be configured to provide for up to a maximum number of communication request queues 122. For example, a system may support up to 64K queues. Each queue establishes a sequence of operations to be performed on a particular communication channel. At any given time each queue may or may not have operations to execute. The servicing of the communication queues 122 may be performed by a host or local channel adapter 124. The local channel adapter 124 decides which request queue 122 is serviced over its communication channel through the communication fabric 126 to the appropriate remote channel adapter and communication device 128. Since the bandwidth of the communication fabric 126 is limited, the local channel adapter 124 must decide how best to distribute the limited bandwidth amongst the communication request queues 122. It is typically desirable that all queues be serviced in a fair fashion. Fairness may mean equal access to a given amount of bandwidth based upon a particular service class. Different service classes may be allocated different amounts of total bandwidth.
When a number of communication channels share bandwidth of physical media, as illustrated in FIG. 2, each channel may be viewed as a queue with traffic/packets to be sent over the media or fabric. Typically the number of channels greatly exceeds the number of physical media paths that are shared by the physical devices to which communication requests are being made. In some systems it may be desirable that each channel should get equal access to physical media bandwidth and that bandwidth should be equally shared amongst all the channels.
One solution to allocating bandwidth amongst the channels is a round robin, one message per channel allocation. However, a message based allocation may unfairly favor requesters with larger messages. For example, those channels with requests for large amounts of data would get more bandwidth than those channels with requests for smaller amounts of data.
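The imbalance described above can be illustrated with a minimal sketch. The function name `round_robin_by_message` and the message sizes (4096 bytes and 64 bytes) are hypothetical values chosen only for illustration; they do not come from any particular system.

```python
from collections import deque


def round_robin_by_message(queues, passes):
    """Serve one message per non-empty channel per pass.

    `queues` is a list of per-channel message sizes in bytes.
    Returns the total bytes sent on each channel.
    """
    pending = [deque(q) for q in queues]
    sent = [0] * len(queues)
    for _ in range(passes):
        for channel, queue in enumerate(pending):
            if queue:
                # One whole message is sent regardless of its size.
                sent[channel] += queue.popleft()
    return sent


# Hypothetical workload: channel 0 sends large messages, channel 1 small ones.
large_messages = [4096] * 10
small_messages = [64] * 10
bytes_sent = round_robin_by_message([large_messages, small_messages], 10)
print(bytes_sent)
```

Both channels receive the same number of service turns, yet under these assumed sizes the channel with large messages moves 64 times as many bytes as the channel with small messages.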
Therefore, it would be desirable to have a fairer bandwidth allocation than a pure one message per channel mechanism. It may also be advantageous to group channels into classes that vie for gross amounts or percentages of media bandwidth. This combination of grouping channels into classes and allocating bandwidth based on class may provide for differing levels of service.
Another problem that must be addressed when servicing multiple communication channels is how to keep track of which channels need to be serviced and which channel should be serviced next. One solution is to provide a block of memory in which a single bit maps to a channel. If the bit for a particular channel is set, it indicates that the corresponding channel has a pending request. After a channel is serviced, this block of memory may be examined to determine the next channel that has a pending request. This solution requires a number of bits of memory equal to the maximum number of channels supported. For example, if 64K channels are supported then 64K bits of memory are needed to indicate the status of each channel. One drawback of this solution is that much time must be spent scanning the memory for the next channel that has a pending request. For example, if the memory can be searched 32 bits at a time then the worst case search time would be 2K memory accesses (64K bits divided by 32 bits).
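The one bit per channel scan and its worst case can be sketched as follows. This is an illustrative sketch only: the names `next_pending`, `WORD_BITS`, and `NUM_WORDS` are assumptions, and the scan is simplified to start at a word boundary and to count each 32-bit word read as one memory access.

```python
WORD_BITS = 32                          # bits examined per memory access
NUM_CHANNELS = 64 * 1024                # 64K channels, one status bit each
NUM_WORDS = NUM_CHANNELS // WORD_BITS   # 2048 words (2K)


def next_pending(words, start_word):
    """Scan circularly from `start_word` for the first set bit.

    Returns (channel, reads), where `reads` is the number of words
    examined. `channel` is None if no request is pending.
    """
    reads = 0
    for offset in range(len(words)):
        index = (start_word + offset) % len(words)
        word = words[index]
        reads += 1
        if word:
            # Position of the lowest set bit in this word.
            bit = (word & -word).bit_length() - 1
            return index * WORD_BITS + bit, reads
    return None, reads


# Worst case: the only pending channel lies in the last word examined.
words = [0] * NUM_WORDS
words[NUM_WORDS - 1] = 1 << 31          # only channel 65535 is pending
channel, reads = next_pending(words, 0)
print(channel, reads)
```

With a single pending channel in the last word scanned, every one of the 2048 words is read before the channel is found, which matches the 2K access figure given above.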
Alternatively, linked list structures may be used to indicate the next channel requiring service. The linked list solution avoids the search time inefficiencies of the previously described solution since the next channel to be serviced may be immediately indicated by the linked list structure. However, a linked list solution requires much more memory to implement. The first described solution, in which one bit per channel is used to indicate whether or not a request is pending in a channel, requires on the order of N bits of memory for N channels. However, a linked list solution requires on the order of N*logN bits of memory, since each of the N entries must hold a channel identifier of logN bits. For example, for a 64K channel system a linked list solution would require on the order of 128 kilobytes of memory to maintain a list of active queues if the list was not sparse (approximately 16 times as much memory as the one bit per channel solution).
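The memory comparison can be checked with a short calculation. The sketch below assumes one link per channel, with each link just wide enough to name any channel (16 bits for 64K channels); the function names are hypothetical, and any real implementation may carry additional overhead.

```python
def bitmap_bytes(num_channels):
    """Memory for one status bit per channel, in bytes."""
    return num_channels // 8


def linked_list_bytes(num_channels):
    """Memory for one next-channel link per channel, in bytes.

    Assumes each link holds just enough bits to identify any channel.
    """
    link_bits = (num_channels - 1).bit_length()   # 16 bits for 64K channels
    return num_channels * link_bits // 8


n = 64 * 1024
print(bitmap_bytes(n))        # bytes for the bitmap
print(linked_list_bytes(n))   # bytes for the linked list
print(linked_list_bytes(n) // bitmap_bytes(n))
```

Under these assumptions the bitmap occupies 8 kilobytes and the linked list 128 kilobytes, a factor of 16, consistent with the figures above.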
Thus, while a linked list structure may be time efficient for determining the next queue to be serviced, it is not memory efficient. In contrast, the one bit per channel solution may be somewhat more memory efficient, but may not be time efficient for determining the next queue to be serviced. Thus, it would be desirable to have a solution for determining which queue is to be serviced that is efficient in both time and memory.
In summary, servicing multiple communication channels presents numerous problems. It is desirable to select the next channel or queue for servicing in a timely fashion. It may also be desirable to select the next channel or queue for servicing in a fair manner while providing for different service classes. It may also be desirable to perform the channel/queue selection using a minimal or reasonable amount of resources (e.g., memory).