Enterprises have grown increasingly reliant on computing systems to accomplish mission-critical tasks. Indeed, enterprises expend vast amounts of resources deploying and supporting servers and other processing systems, as availability of sufficient computing resources is critical to enterprise performance. With the increasing complexity and number of applications typically deployed in business environments today, providing sufficient computing resources to meet business needs in a cost-effective manner poses many challenges.
The workload on a typical server implementation can vary dramatically throughout the day. In fact, demand for one or more applications, and therefore the processing and I/O resource requirements to effectively support these applications, can shift and vary considerably throughout a typical business day. Application server implementations usually involve building out the infrastructure either to peak loads or to average loads. Building to peak loads can mean a massive over-commitment of resources. This over-commitment can be fairly large when the management and operational costs of the additional servers are included. Building out the infrastructure to the average load is a common approach for many application implementations, but can have even larger productivity costs when demand spikes. For example, in the case of email application servers, slow email access and error messages are frustrating for users and, at times, can lead to confusing email conversations as messages are delivered out of order. In extreme cases, where disk I/O has become too slow to access the database, the mail application server services can fail, disrupting operations all across the enterprise.
In either type of application server deployment, a server application runs on a host that manages its own local or remote storage, and often network access as well. The size of the host server, along with its storage and network capacity, bounds the number of users this host can support. In the case of email applications, for example, large installations will break up the email user community into blocks of users, each of which can be managed and hosted on a separate server. For example, email server A might service users with last names beginning with letters A through L, and email server B might host users whose last names begin with letters M through Z. Scaling such an architecture up (or down) requires physically adding (or removing) hardware resources. This approach, however, is generally time-consuming and costly, and fails to address the variation in demand across different applications.
Systems have been developed to provide for virtualized access to input/output (I/O) subsystems. Some systems include virtual I/O server systems that allow multiple stand-alone application servers or virtual servers to share one or more I/O subsystems, such as host-bus adapters and network interfaces. I/O access is managed by one or more virtual I/O servers to which the application servers are connected over a network, such as a switched or routed network. To provide network quality of service (QoS) to the application servers, a network administrator might use software/hardware to shape the network's traffic.
A variety of network QoS mechanisms exist. For example, the token bucket is an algorithm for network traffic shaping or rate limiting. Typically, the token bucket is used to control the amount of data that is injected into a network, allowing for “bursts” of data to be sent. Conceptually, the traffic shaper employs a token bucket which contains tokens, each of which might represent a unit of bytes. A set of configurable parameters tells the traffic shaper how many tokens are needed to transmit a given number of bytes and sets a capacity for the token bucket, say b tokens. Then in some embodiments (e.g., those that transmit packets), the traffic shaper proceeds as follows: (1) a token is added to the bucket every 1/r seconds for some constant rate r; (2) since the bucket can hold at most b tokens, if a token arrives when the bucket is full, the token is discarded; (3) when a packet of n bytes arrives, N tokens (where N is proportional to n, as defined by a configurable parameter) are removed from the bucket, and the packet is sent to the network; and (4) if fewer than N tokens are available, no tokens are removed from the bucket, and the packet is considered to be non-conformant.
The token bucket algorithm allows bursts of up to b tokens' worth of data (b bytes when each token represents one byte), but over the long run the output of conformant packets is limited to the constant rate, r. A non-conformant packet might be treated in one of the following ways: (a) it might be dropped; (b) it might be enqueued for subsequent transmission when sufficient tokens have accumulated in the bucket; or (c) it might be transmitted, but marked as being non-conformant, possibly to be dropped subsequently if the network is overloaded.
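The token bucket steps described above can be illustrated with a minimal Python sketch. It is not taken from any particular implementation: the class name TokenBucket, the method offer, and the parameter bytes_per_token are hypothetical names chosen for illustration. The sketch also assumes that tokens accrue continuously at r tokens per second rather than in discrete 1/r-second ticks, and that the caller supplies the current time, so that the behavior is deterministic.

```python
from dataclasses import dataclass


@dataclass
class TokenBucket:
    """Illustrative token bucket; all names here are hypothetical."""
    rate: float            # r: tokens added per second
    capacity: float        # b: maximum number of tokens the bucket holds
    bytes_per_token: int   # assumed mapping from bytes to tokens
    tokens: float = 0.0    # tokens currently in the bucket
    last_time: float = 0.0 # time of the most recent refill, in seconds

    def _refill(self, now: float) -> None:
        # Steps (1) and (2): accrue tokens for the elapsed time and
        # discard anything beyond the bucket capacity b.
        elapsed = max(0.0, now - self.last_time)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last_time = max(self.last_time, now)

    def tokens_needed(self, n_bytes: int) -> int:
        # N is proportional to n; round up so a partial unit costs a token.
        return -(-n_bytes // self.bytes_per_token)

    def offer(self, n_bytes: int, now: float) -> bool:
        """Return True if the packet conforms (tokens are consumed),
        False if it is non-conformant (no tokens are removed)."""
        self._refill(now)
        needed = self.tokens_needed(n_bytes)
        if needed > self.tokens:
            return False       # step (4): non-conformant, bucket untouched
        self.tokens -= needed  # step (3): remove N tokens and send
        return True


# Example: a full bucket of 5 tokens, refilled at 2 tokens per second.
bucket = TokenBucket(rate=2.0, capacity=5.0, bytes_per_token=100, tokens=5.0)
print(bucket.offer(500, now=0.0))  # a burst that spends the whole bucket
print(bucket.offer(100, now=0.0))  # bucket is empty, so non-conformant
print(bucket.offer(100, now=0.5))  # one token has accrued after 0.5 s
```

The sketch only classifies a packet as conformant or non-conformant; whether a non-conformant packet is dropped, queued, or marked is left to the caller, mirroring options (a) through (c) above.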
Hierarchical token bucket (HTB) is a variation on the token bucket algorithm. HTB was implemented by Martin Devera as part of the Linux kernel; the Linux man page for HTB is tc-htb(8). As its name suggests, HTB involves a number of token bucket filters arranged in a hierarchy. Devera's implementation apparently builds upon the three-color token bucket filter described in RFC 2697, A Single Rate Three Color Marker, (September 1999), published by the Internet Engineering Task Force (IETF).
A variety of networking and fabric interconnection technologies exist. InfiniBand®, for example, is a switched fabric communications link primarily used in high-performance computing. InfiniBand is designed to be scalable and its features include quality of service (QoS) and failover. The InfiniBand architecture specification defines a connection between processor or application nodes and high performance I/O nodes such as storage devices. Ethernet is another example network technology used, for instance, in Local Area Networks (LANs). Ethernet stations communicate by sending each other data packets, small blocks of data that are individually sent and delivered. Each Ethernet station is assigned a single 48-bit MAC address, which is used both to specify the destination and the source of each data packet. Network interface controllers (NICs), except when running in a promiscuous mode, normally do not accept packets addressed to other Ethernet stations.