1. Field of the Invention
Embodiments of the present invention relate generally to multiprocessor systems and more specifically to dynamic control of scaling in computing devices.
2. Description of the Related Art
Unless otherwise indicated herein, the approaches described in this section are not prior art to the claims in this application and are not admitted to be prior art by inclusion in this section.
Multiprocessor systems have become increasingly common as a means of handling the ever-increasing amount of network traffic. To efficiently utilize the processing capabilities of the multiple processors, the operating systems for these systems need to intelligently distribute the workload. One approach is the use of receive-side scaling (“RSS”) technology, which enables packet receive-processing to scale with the number of available processors. To illustrate, FIG. 1A is a simplified block diagram of a computing device 100 in which the RSS technology can be implemented. Specifically, computing device 100 includes multiple processing units 111, such as processing units 1, 2, and n. The n processing units share network adapter 104 via high-speed I/O bridge 102 and execute certain interrupt service routines (“ISRs”) and deferred procedure calls (“DPCs”) to support the RSS technology. The operating system of computing device 100 schedules the execution of the ISRs and the DPCs and provides a network protocol stack and a network driver interface for network adapter 104. Some other aspects of the RSS technology are performed by network adapter 104.
FIG. 1B is a simplified flow diagram illustrating the RSS technology mainly from the perspective of network adapter 104. Network adapter 104 receives packets from the network in step 150. Before issuing any interrupt, network adapter 104 computes a signature for each packet using a hash function and transfers a receiver descriptor and the packet contents to system memory 106 in step 152. A receiver descriptor generally includes the signature as well as information facilitating the communication between network adapter 104 and the operating system. The signature is used for load balancing across the multiple processing units in computing device 100. Then, in step 154, network adapter 104 issues an interrupt to the operating system, causing the ISR to execute. The ISR schedules the DPC to execute on a designated processing unit based on the signature of the packets received by network adapter 104. Generally, the same default processing unit executes all the ISRs. The ISR also instructs network adapter 104 to disable interrupts in step 156; in other words, even if network adapter 104 receives additional packets, it does not issue further interrupts. However, to address the potential lag between the ISR execution and the DPC execution, the receiver descriptors associated with the received packets generally are queued. If the DPC completes processing of the receiver descriptors that have been queued, or if a certain period of time has elapsed in step 158, then the DPC re-enables the interrupts for network adapter 104 in step 160.
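By way of illustration only, the flow of FIG. 1B can be approximated in a short simulation. The following is a minimal sketch under stated assumptions, not the implementation of any actual network adapter or operating system: the class `AdapterSim`, the function `compute_signature`, the use of CRC32 as a stand-in hash function, the modulo mapping from signature to processing unit, and the rule that interrupts are re-enabled once every queue is empty are all hypothetical choices made for this example. The time-out path of step 158 is omitted.

```python
import zlib


def compute_signature(flow):
    """Hash a flow tuple into a 32-bit signature.

    CRC32 is only a stand-in; the hash function is an assumption of this sketch.
    """
    src_ip, src_port, dst_ip, dst_port = flow
    key = f"{src_ip}:{src_port}->{dst_ip}:{dst_port}".encode()
    return zlib.crc32(key)


class AdapterSim:
    """Hypothetical model of the adapter, ISR, and DPC interaction."""

    def __init__(self, num_units):
        self.num_units = num_units
        self.interrupts_enabled = True
        self.interrupts_issued = 0
        # One queue of receiver descriptors per processing unit.
        self.queues = {unit: [] for unit in range(num_units)}

    def receive(self, flow, payload):
        # Step 152: compute the signature and store a receiver descriptor.
        signature = compute_signature(flow)
        unit = signature % self.num_units  # assumed mapping to a processing unit
        self.queues[unit].append({"signature": signature, "length": len(payload)})
        # Step 154: issue an interrupt only while interrupts are enabled.
        if self.interrupts_enabled:
            self.interrupts_issued += 1
            # Step 156: the ISR disables further interrupts.
            self.interrupts_enabled = False
        return unit

    def run_dpc(self, unit):
        # The DPC drains the descriptors queued for its processing unit.
        processed = len(self.queues[unit])
        self.queues[unit].clear()
        # Step 160 (assumed rule): re-enable once no descriptors remain queued.
        if all(not queue for queue in self.queues.values()):
            self.interrupts_enabled = True
        return processed
```

In this sketch, packets belonging to the same flow always produce the same signature and therefore land on the same processing unit, while packets arriving after the first interrupt are queued without raising additional interrupts until the DPC has drained the queues.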
The deployment of the RSS technology involves a certain amount of overhead, such as the aforementioned signature generation and the processing of the ISRs and the DPCs, to enable load balancing across the different processing units. The cost of this overhead can be justified in two scenarios: 1) when there is a considerable amount of packet processing work to be shared among the multiple processing units; and 2) when at least one processing unit is being over-utilized. In other words, if the traffic on the network is light or if all the processing units in computing device 100 are underutilized, then the benefits of load balancing offered by the RSS technology are reduced such that they do not outweigh the cost of the associated overhead. Therefore, in low-traffic situations, automatically implementing the RSS technology negatively impacts the overall performance of computing device 100.
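The two scenarios described above can be expressed as a simple predicate. The following is a minimal sketch for illustration only; the function name `rss_overhead_justified`, its parameters, and both threshold values are hypothetical and are not taken from any actual system.

```python
def rss_overhead_justified(
    packets_per_second,
    unit_utilizations,
    traffic_threshold=10_000,      # hypothetical threshold for "considerable" work
    overutilized_threshold=0.9,    # hypothetical threshold for "over-utilized"
):
    """Return True when either scenario that justifies the RSS overhead holds."""
    # Scenario 1: a considerable amount of packet processing work to share.
    heavy_traffic = packets_per_second >= traffic_threshold
    # Scenario 2: at least one processing unit is being over-utilized.
    any_overutilized = any(u >= overutilized_threshold for u in unit_utilizations)
    return heavy_traffic or any_overutilized
```

For example, under these assumed thresholds, light traffic with every processing unit lightly loaded yields False, which corresponds to the case in which the overhead is not outweighed by the benefits of load balancing.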
As the foregoing illustrates, what is needed in the art is a technique for dynamically controlling scaling in computing devices to optimize the overall performance of these systems.