Ever since the introduction of the microprocessor, computer systems have been getting faster and faster. In approximate accordance with Moore's law (based on Intel® Corporation co-founder Gordon Moore's 1965 publication on the growth of transistor counts on integrated circuits, a prediction he later revised to a doubling every two years), the speed increase has shot upward at a fairly even rate for nearly three decades. At the same time, the size of both memory and non-volatile storage has also steadily increased, such that many of today's personal computers are more powerful than supercomputers from just 10-15 years ago. In addition, the speed of network communications has likewise seen astronomical increases.
Increases in processor speeds, memory, storage, and network bandwidth technologies have resulted in the build-out and deployment of networks with ever-increasing capacities. More recently, the introduction of cloud-based services, such as those provided by Amazon (e.g., Amazon Elastic Compute Cloud (EC2) and Simple Storage Service (S3)) and Microsoft (e.g., Azure and Office 365), has resulted in additional build-out of public network infrastructure, as well as the deployment of massive data centers that employ private network infrastructure to support these services.
Cloud-based services are typically facilitated by a large number of interconnected high-speed servers, with host facilities commonly referred to as server “farms” or data centers. These server farms and data centers typically comprise a large-to-massive array of rack and/or blade servers housed in specially-designed facilities. Many of the larger cloud-based services are hosted via multiple data centers that are distributed across a geographical area, or even globally. For example, Microsoft Azure has multiple very large data centers in each of the United States, Europe, and Asia. Amazon employs co-located and separate data centers for hosting its EC2 and AWS services, including over a dozen AWS data centers in the US alone. Typically, data is replicated across geographically dispersed data centers to ensure full service availability in case all or a portion of a data center goes down due to power failure/availability events (e.g., blackouts and brownouts), weather events and other natural disasters, network availability issues (e.g., high-capacity optical cables being cut or otherwise unavailable), or other causes.
Of significant importance are power consumption and cooling considerations. Faster processors generally consume more power, and when such processors are closely packed in high-density server deployments, overall performance is often limited by cooling requirements. Not only do the processors and other components in the servers consume a very large amount of power, but significant additional power is also consumed for cooling purposes. As a result, one of the largest operating costs for data centers is power. While much improvement has been made in the form of lower power-consuming silicon, better cooling management, and smart power supplies, hardware vendors are quickly hitting a wall in reducing energy costs.
Another aspect of data centers is scalability. As workloads increase and decrease, servers are brought “on-line” and taken “off-line,” wherein an on-line server is available to service work requests while an off-line server is unavailable to service work requests. Rather than shutting off-line servers down completely, these servers are typically put in a reduced power state under which the server processors (the main power consumers) are put into a “sleep” or “sleeping” state (noting that some processors support multiple levels of reduced power states).
In recent years, network adapters and interfaces have been introduced that also support reduced power states, such as some Ethernet adapters and InfiniBand (IB) Host Channel Adapters (HCAs). However, there are currently no mechanisms for putting InfiniBand switches into reduced power states, whether by individual port or across an entire IB switch.