The present invention relates to the installation, operation, and maintenance of communication and data networks and networked computers at the circuit/mother board level. In particular, embodiments of the present disclosure are directed to devices, systems, and methods for distributed monitoring and control of server computers and other network devices, at the circuit/mother board level.
Unless otherwise indicated herein, the approaches described in this section are not prior art to the claims in this application and are not admitted to be prior art by inclusion in this section.
As cloud computing and other online software services, such as software as a service (SaaS), become increasingly popular and ubiquitous, computing resource providers face the challenges associated with the ever-increasing size of server facilities, or so-called “server farms.” In addition to the space and power required to operate hundreds, if not thousands, of networked computer servers, computing resource providers must also consider the maintenance, monitoring, and repair of such large deployments of server computers. To address the space and power distribution requirements of such large-scale computer server operations, many computing resource providers use rack mount server computer configurations 100, such as that shown in FIG. 1.
As shown in FIG. 1, rack mount form factor server computers 130 can be installed in a server rack 105. The use of server racks 105 to house and organize rack mount form factor server computers 130 addresses two initial challenges that face operators of large-scale server farms. Firstly, by arranging the computer servers vertically, as shown in FIG. 1, the rack 105 reduces the physical footprint required for each rack mount form factor server computer 130, thereby increasing the server computer density per square foot in a server farm facility. Secondly, many server computer racks 105 also include backplane power distribution sockets, which allow rack mount form factor server computers 130 to be physically inserted into the rack 105 and electrically connected to a power supply simultaneously. The server computers in the rack can then be connected to one another, and/or to other server computers external to the rack, to form a network of computers.
To create the necessary network connections, each server computer 130 is individually coupled via one or more networking cables to an appropriate network communications device, such as a switch, bridge, or router 110, as shown in FIG. 1. Such high-density rack-mounted configurations 100, some of which can include multiple racks of rack mount form factor computers 130 in a single room, tend to generate large amounts of excess heat that must be managed appropriately to prevent overheating, which, if left unchecked, can damage or destroy the server installation.
To manage the excess heat, various systems exist for cooling or otherwise extracting the heat from the server room in which the server racks of server computers are installed. For example, many server farms are equipped with powerful air conditioner systems to flood the server room with cooled and humidity-controlled air 115. While effective, such systems usually lack feedback based on server position to tell the cooling system where more cooling is required and where the cooling can be reduced. As such, conventional server farm cooling systems often run at temperatures lower than required, at great expense to the server farm operator. The effectiveness and efficiency of such cooling systems are further reduced by the fact that most cooling system configurations direct the cold air 115 from the bottom end of the server rack 105. This results in an undesirable temperature gradient 120, with cooler temperatures achieved at the bottom 125 of the rack 105 and higher temperatures, due to the waste heat from the server computers 130, at the top 127.
In an attempt to reduce the overall operating cost of the server room, some operators may choose to operate the cooling systems at higher temperatures, at the risk of overheating some or all of the server computers at the top 127 of rack 105, or in another rack that may be operating at higher temperatures than rack 105 due to higher computing activity. Accordingly, a server farm operator using conventional cooling systems must balance the cost savings associated with running the cooling system at higher temperatures against the risk of overheating one or more of the server computers 130 in one or more server racks.
Furthermore, as the density of server computers 130 in a given server farm installation increases, the task of maintaining, troubleshooting, repairing, and auditing the installation of server computers grows increasingly arduous. Since the motivation to reduce costs typically deters server farm operators from hiring additional technicians, most server farm operators seek to increase the level of automation with which they can efficiently and effectively operate. The ability to detect, locate, diagnose, and troubleshoot, as well as indicate to technicians, server computers that are experiencing unfavorable environmental conditions, such as overheating, or that have suffered a software, firmware, or hardware malfunction, can greatly increase the efficiency and effectiveness of each technician tasked with operating the server farm. Increasing the effectiveness and efficiency of each technician can translate into not only cost savings for the server farm operator, but also increased performance and revenue. While some systems exist for accomplishing such monitoring and maintenance, existing solutions include dedicated computing devices and additional monitoring structures that increase the cost of acquisition and installation, thus negating much of the cost savings that a server farm operator might achieve through the resulting reduced operational costs.
Thus, there is a need for improved systems, devices, and methods for distributed monitoring and control of networked server computers. The present invention solves these and other problems by providing network cables with distributed sensors that can deliver information and control signals amongst a distributed installation of existing server computers and switches using existing communication channels and protocols.