In recent years, queue management systems have been proposed for distributing incoming and outgoing traffic to and from a host through a network interface card (NIC) with multiple queues. FIG. 1 illustrates one such system. Specifically, it illustrates (1) multiple virtual machines (VMs) 102 that execute on a host computer (not shown), and (2) a NIC 100 that has multiple queues. As shown in this figure, each queue has a receive side set 104 of buffers and a transmit side set 106 of buffers to handle incoming and outgoing traffic, respectively. The system has four types of queues: a default queue 105, several non-default queues 115, LRO (large receive offload) queues 120, and RSS (receive side scaling) queues 125. The latter two types are specialty queues tied to specific hardware LRO and RSS functionalities supported by the NIC.
The queue management system of FIG. 1 distributes traffic to and from the VMs across multiple queues. In this system, all VMs start out in the default queue 105. A VM is moved out of the default queue to a non-default queue 115 whenever its traffic exceeds a given threshold. When moving a VM out of the default queue, this implementation always moves the VM to the least-loaded non-default queue, regardless of the VM's requirements. This causes three major problems.
First, since the current implementation chooses a non-default queue without considering the VM's traffic type, VMs with special requirements might suffer interference from other VMs. For example, if a special VM that transmits and receives latency-sensitive traffic shares the same queue with several other VMs running less latency-sensitive, throughput-intensive workloads, the latency and jitter of the special VM will be adversely affected. Queue 150 in FIG. 1 is an example of an overloaded queue that has traffic for both a low latency required (LLR) VM 152 and several high latency tolerating (HLT) VMs. In this situation, the LLR VM 152 might not be able to send and receive traffic within the maximum latency that it can tolerate because of the traffic of the various HLT VMs.
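The placement policy described above, and the way it can co-locate an LLR VM with HLT VMs, can be illustrated with a minimal sketch. The names `Queue`, `place_vm`, the load units, and the threshold value are hypothetical and chosen only for illustration; they are not taken from any actual driver.

```python
from dataclasses import dataclass, field


@dataclass
class Queue:
    """Hypothetical model of a NIC queue and the VMs it serves."""
    name: str
    vm_loads: dict = field(default_factory=dict)  # VM name -> traffic load

    @property
    def load(self):
        # Aggregate load is the sum of the loads of all VMs on this queue.
        return sum(self.vm_loads.values())


def place_vm(vm, load, threshold, default_queue, non_default_queues):
    """Place a VM the way the prior approach does.

    A VM stays on the default queue until its traffic exceeds the
    threshold; it is then moved to the least-loaded non-default queue,
    with no regard for the VM's traffic type (assumed policy).
    """
    if load <= threshold:
        default_queue.vm_loads[vm] = load
        return default_queue
    default_queue.vm_loads.pop(vm, None)
    target = min(non_default_queues, key=lambda q: q.load)
    target.vm_loads[vm] = load
    return target


default_q = Queue("default")
pool = [Queue("q1", {"hlt_a": 50.0}), Queue("q2", {"hlt_b": 10.0})]

# The latency-sensitive VM lands on q2 next to a throughput-intensive VM,
# because only aggregate load is considered.
chosen = place_vm("llr_vm", 5.0, 1.0, default_q, pool)
print(chosen.name, sorted(chosen.vm_loads))
```

In this sketch the selection key is aggregate load alone, which is why the latency-sensitive VM ends up sharing a queue with a throughput-intensive one.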
The second problem with this implementation is that it statically assigns a fixed number of queues to each of the three different non-default pools of queues, which are the non-default queues 115, the LRO queues 120, and the RSS queues 125. In this approach, each pool has all of its queues assigned and allocated during driver initialization. By default, each pool gets the same number of queues, even if the pool is not in use. This results in a performance issue when a pool needs more queues to sustain its traffic: the overloaded pool can never take over free queues from other pools and thus can never grow further, even if the system has spare capacity.
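The effect of this static partitioning can be seen in a small sketch. The function names, the pool labels, the equal split with the remainder discarded, and the figure of 12 queues are assumptions made for illustration only.

```python
def partition_queues(total_queues, pools=("non_default", "lro", "rss")):
    """Split the queues equally among the pools once, at initialization.

    Any remainder is left unassigned in this sketch (an assumption).
    """
    share = total_queues // len(pools)
    return {pool: share for pool in pools}


def can_grow(pool, allocation, in_use):
    """A pool can grow only within its own fixed share of queues."""
    return in_use[pool] < allocation[pool]


allocation = partition_queues(12)
in_use = {"non_default": 4, "lro": 0, "rss": 0}

# Queues sitting idle in pools other than the overloaded one.
idle_elsewhere = sum(
    allocation[p] - in_use[p] for p in allocation if p != "non_default"
)

# The non-default pool is saturated and cannot borrow the idle queues.
print(can_grow("non_default", allocation, in_use), idle_elsewhere)
```

Here the non-default pool has exhausted its share while eight queues remain idle in the LRO and RSS pools, yet nothing in the static scheme lets the busy pool claim them.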
The third problem is that queue assignment for a VM is one-time, i.e., once the VM moves to a queue, it will never be moved to another non-default queue. This causes two issues. First, because the assignment is one-time, if a VM later needs more resources to grow its traffic, it might end up being limited by the utilization of its current queue. Even if there is a less busy queue that has more room to grow, this prior approach does not allow the VM to move to it. Second, this approach tries to statically keep all queues busy, even when fewer queues would suffice to serve the traffic. Since this approach has a dedicated kernel context for each queue, having an unnecessarily large number of active queues results in more active contexts. These active contexts inevitably preempt other contexts (such as vCPUs) when an interrupt arrives. Therefore, the host ends up spending more cycles on context switches, which hurts the VM consolidation ratio.