Interrupt coalescing is a technique used to reduce processing unit overhead by batching requests and using a single interrupt for each batch. This technique effectively limits the interrupt rate and, therefore, the overhead of interrupt processing. While current approaches generally reduce processing unit overhead, they do not handle certain corner cases well, leading to sub-optimal application performance and inefficient system-wide resource utilization.
For example, one approach to coalescing sets a static value for the size of each batch of requests (referred to herein as “depth” or “coalescing depth”). The rate at which interrupts are generated (referred to herein as “coalescing rate”) is then determined by the quotient of the rate at which requests are received and the static depth value. When the system is lightly loaded, the coalescing rate will be low and requests may be delayed as the system waits for additional requests to meet the static depth. Such a delay is undesirable if a latency-sensitive workload is running. On the other hand, if the static depth is set to a smaller value and the system is heavily loaded, the processing unit will suffer from the overhead of the higher interrupt rate.
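The trade-off of the static depth approach can be illustrated with a minimal sketch. The function name, the parameter names, and the assumption that requests arrive at a steady, evenly spaced rate are illustrative only and are not taken from any particular implementation.

```python
def static_depth_coalescing(request_rate: float, depth: int) -> tuple[float, float]:
    """Model a static coalescing depth under a steady request rate.

    Hypothetical helper for illustration. Assumes evenly spaced arrivals.
    Returns (coalescing rate in interrupts per second,
             worst-case added delay in seconds for the first request of a batch).
    """
    if request_rate <= 0 or depth < 1:
        raise ValueError("request_rate must be positive and depth at least 1")
    # Coalescing rate is the quotient of the request rate and the static depth.
    coalescing_rate = request_rate / depth
    # The first request of a batch waits for (depth - 1) further arrivals
    # before the batch is full and the interrupt is generated.
    worst_case_delay = (depth - 1) / request_rate
    return coalescing_rate, worst_case_delay


# Same static depth, two very different loads (values chosen for illustration).
light_rate, light_delay = static_depth_coalescing(request_rate=100, depth=8)
heavy_rate, heavy_delay = static_depth_coalescing(request_rate=100_000, depth=8)
print(f"light load: {light_rate} interrupts/s, up to {light_delay * 1e3:.2f} ms added delay")
print(f"heavy load: {heavy_rate} interrupts/s, up to {heavy_delay * 1e3:.2f} ms added delay")
```

Under these assumptions, a depth that keeps the interrupt rate manageable under heavy load adds a comparatively long delay under light load, while a smaller depth shortens that delay at the cost of a proportionally higher interrupt rate under heavy load.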
In another approach, the system determines a variable depth based upon a quotient of the actual number of requests and a static coalescing rate. As a result, a high number of requests results in a greater depth and a low number of requests results in a smaller depth. This approach, however, suffers from a problem similar to that of the static depth approach described above. A lightly loaded system running a latency-sensitive workload may still need a higher interrupt rate than the fixed coalescing rate provides. Conversely, when the system is heavily loaded, the fixed coalescing rate may still leave the processing unit inefficiently burdened by interrupt overhead.
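The variable depth approach can be sketched in the same way. Here the "number of requests" is interpreted as a request rate in requests per second, the quotient is rounded up, and the depth is floored at one; these choices, along with the function and parameter names, are assumptions made for illustration.

```python
import math


def variable_depth(request_rate: float, static_coalescing_rate: float) -> int:
    """Derive a coalescing depth from the observed load and a fixed interrupt rate.

    Hypothetical helper for illustration. Rounding up and the minimum depth
    of 1 are assumptions, not details taken from the description.
    """
    if request_rate < 0 or static_coalescing_rate <= 0:
        raise ValueError("request_rate must be non-negative and the coalescing rate positive")
    # Depth is the quotient of the request rate and the static coalescing rate.
    return max(1, math.ceil(request_rate / static_coalescing_rate))


# With a fixed coalescing rate of 1000 interrupts per second (illustrative value),
# the depth scales with the load while the interrupt rate stays pinned.
for rate in (100, 2_000, 1_000_000):
    print(f"{rate} requests/s -> depth {variable_depth(rate, 1_000)}")
```

Because the interrupt rate is pinned at the static value whenever the load exceeds it, a request in this sketch may wait up to roughly one coalescing period regardless of load, and the processing unit handles the same number of interrupts per second whether or not it can spare the cycles.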
The implications of interrupt coalescing are further complicated in a virtual computing environment. For example, a host computer may run as many as one thousand virtual machines, each of which is generating interrupt requests. If each virtual machine is generating thousands of requests per second, the virtualization software (upon which all of the virtual machines run and rely for access to physical resources) receives millions of requests per second. Given this volume of requests, the virtual computing environment is likely to experience a greater variance in system load and, therefore, is more likely to encounter the aforementioned corner cases.