Networks connect many different personal computers (PCs), workstations, or terminals to each other, and to one or more host computers, printers, file servers, etc., so that expensive computing assets, programs, files and other data may be shared among many users.
In a network utilizing a client/server architecture, the client (personal computer or workstation) is the requesting machine and the server is the supplying machine, both of which may preferably be connected via the network, such as a local area network (LAN), wide area network (WAN) or metropolitan area network (MAN). This is in contrast to early network systems that utilized a mainframe with dedicated terminals.
In a client/server network, the client typically contains a user interface and may perform some or all of the application processing and, as mentioned above, can include personal computers or workstations. The server in a client/server network can be a high-speed microcomputer or minicomputer and, in the case of a high-end server, can include multiple processors and mass data storage such as multiple CD-ROM drives and multiple hard drives, preferably with redundant array of inexpensive disks (RAID) protection. An exemplary server such as a database server maintains the databases and processes requests from the client to extract data from or update the database. An application server provides additional business processing for the clients. The network operating system (NOS), together with the database management system (DBMS) and transaction monitor (TP monitor), is responsible for the integrity and security of the server.
Client/server networks are widely used throughout many different industries and business organizations, especially where mission-critical applications requiring high performance are routinely launched. The mass storage and multi-processing capabilities provided by current client/server network systems (for example, the high-end servers) that run such applications permit a wide range of essential services and functions to be provided through their use.
As can be appreciated, many businesses are highly dependent upon the availability of their client/server network systems to permit essential network services and functions to be carried out. As client/server network systems become increasingly essential to the everyday operations of such businesses, additional steps need to be taken in the design and construction of the server in the client/server network system to ensure its continuous availability to the clients. That is to say, in the design and construction of a server, steps need to be taken to ensure that the server can be operated with little or no downtime.
It can be appreciated by those skilled in the art that high availability, reliability and serviceability are valuable design aspects in ensuring that a server is a "zero downtime" system, that is, one that will operate with little or no downtime. The modularity of components within a server has been recognized as an important design consideration in ensuring that the downtime of a server will be minimized. Modules can be removed and examined for operability or other purposes much more easily than permanently mounted fixtures within a server chassis. When various components of a server can be provided in a modular form, they can also be readily replaced to maintain the operational status of the server with minimal downtime.
Removable modular components may include disc drives and power supplies. As described above, the removability of modular components allows for better overall serviceability of the computer system which is a distinct advantage. For example, a defective power supply in the server generally requires prompt replacement in order to limit downtime. Modular components and connectors facilitate prompt replacement and are thus popular in many computer designs.
Originally, a rule of practice in the maintenance of modular components or printed circuit boards of a server was that of turning the power to the server off before any modular components or printed circuit boards were removed from or added to the chassis or support frame of the server. Recent innovations have centered around a highly desirable design goal of "hot-pluggability" which addresses the benefits derived from inserting and removing modular components and printed cards from the chassis of the server when the server is electrically connected and operational. It can be readily appreciated that modularization and hot-pluggability can have a significant bearing on the high availability aspect of a high-end server.
Hot-pluggable components may include storage or disc drives, drive cages, fans, power supplies, system I/O boards, control boards, processor boards, and other sub-assemblies. The ability to remove these constituent components without having to power down the server allows for better overall serviceability of the system, which is a distinct advantage to both the user and the maintenance technician.
Component redundancy has also been recognized as an important design consideration in ensuring that a server will operate with little or no downtime. Component redundancy is typically provided in a system to better ensure that at least one of the redundant components is operable, thereby minimizing the system downtime. With component redundancy, at least two components are provided that can perform the same function, such that if one of the components becomes faulty for some reason, the operation fails over to the redundant component. When at least one of the redundant components is operable, continued operation of the computer system is possible even if others of the redundant components fail. To further enhance reliability and serviceability, redundant components have been made hot-pluggable.
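The failover behavior described above can be illustrated with a minimal sketch. The class and method names below (`Component`, `RedundantGroup`, `active`, `hot_swap`) and the power supply labels are hypothetical and chosen only for illustration; they do not correspond to any particular server design.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Component:
    """A replaceable module, e.g. a power supply (hypothetical model)."""
    name: str
    healthy: bool = True


class RedundantGroup:
    """Two or more components that can perform the same function."""

    def __init__(self, components: List[Component]) -> None:
        if len(components) < 2:
            raise ValueError("redundancy requires at least two components")
        self.components = list(components)

    def active(self) -> Optional[Component]:
        """Return the first healthy component, or None if all have failed.

        Selecting the next healthy component when an earlier one is
        faulty is the failover step.
        """
        for component in self.components:
            if component.healthy:
                return component
        return None

    def operational(self) -> bool:
        """The function stays available while any one component is operable."""
        return self.active() is not None

    def hot_swap(self, name: str, replacement: Component) -> None:
        """Replace a component in place without taking the group offline."""
        for index, component in enumerate(self.components):
            if component.name == name:
                self.components[index] = replacement
                return
        raise KeyError(name)


if __name__ == "__main__":
    group = RedundantGroup([Component("psu0"), Component("psu1")])
    group.components[0].healthy = False      # psu0 becomes faulty
    print(group.active().name)               # operation fails over to psu1
    group.hot_swap("psu0", Component("psu0-replacement"))
    print(group.operational())               # group never went offline
```

In this sketch the group remains operational throughout the fault and the replacement, which is the combined benefit of redundancy and hot-pluggability that the passage describes.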
Dynamic reconfiguration of a server system can also be accomplished by providing upgradable modular components therein. As can be readily appreciated, this objective can be accomplished by the addition or substitution of components having different circuits, preferably updated or upgraded, disposed therein. When components are redundant and hot-pluggable, reconfiguration of the server is often possible without taking the server offline.
Another important design aspect with respect to providing redundant and hot pluggable components in a server system is to ensure and maintain a safe working environment while the server is operating and being repaired or upgraded. Accordingly, when the system components are swapped or upgraded, the exposure of hot connectors and contacts must be kept to a minimum. It can be appreciated by those skilled in the art that further developments in this area would significantly enhance the reliability and serviceability aspects of a high-end server system.
To further enhance the serviceability of server systems, additional innovations may be required in the design and construction of diagnostic sub-systems thereof. In existing client/server network systems it is often difficult to obtain, in a timely manner, important diagnostic data and information corresponding to a component failure in order to facilitate the quick serviceability of the server. Therefore, it can be appreciated that the more information that can be readily provided to locate a defective component or problem with the server, the sooner the fault can be corrected and the greater the amount of time the server is up and running.
Although the cooling of computer systems has always been a concern with computer designers, the form factor of the chassis, "hot" pluggable components, and the high demands for improved reliability of the client/server network systems (with ever-increasing microprocessor power dissipation and system power consumption) have created additional problems with cooling system design, especially in temperature monitoring and temperature control. Not only are the high-end servers utilizing the newer high-powered processors, but they are also utilizing multiple processors, thereby creating even more heat within the system.
Most often, microprocessors and associated electrical components are cooled by airflow. Fans are used to push or pull air from one side of a chassis holding the electrical components, across the electrical components and out the other side of the chassis. By forcing air to flow over the electrical components, heat is dissipated thereby preventing the electrical components from overheating and failing.
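As a rough illustration of how forced airflow removes heat, the standard steady-state energy balance Q = ρ · V · c_p · ΔT relates the heat carried away (Q, in watts) to the volumetric airflow (V), the air density (ρ), the specific heat of air (c_p), and the temperature rise of the air across the chassis (ΔT). The sketch below solves that relation for the airflow. The property values are approximate figures for air near room temperature at sea level, and the 500 W load and 10 K temperature rise are hypothetical inputs chosen only for illustration.

```python
# Approximate properties of air near room temperature at sea level (assumed).
AIR_DENSITY_KG_M3 = 1.2
AIR_SPECIFIC_HEAT_J_KG_K = 1005.0

# Unit conversion: cubic metres per second to cubic feet per minute.
M3S_TO_CFM = 2118.88


def required_airflow_m3s(heat_w: float, delta_t_k: float) -> float:
    """Airflow needed to carry away heat_w watts with an air temperature
    rise of delta_t_k kelvin, from Q = rho * V * c_p * dT."""
    if heat_w < 0:
        raise ValueError("heat load must be non-negative")
    if delta_t_k <= 0:
        raise ValueError("temperature rise must be positive")
    return heat_w / (AIR_DENSITY_KG_M3 * AIR_SPECIFIC_HEAT_J_KG_K * delta_t_k)


if __name__ == "__main__":
    # Hypothetical example: 500 W dissipated, 10 K allowed air temperature rise.
    flow = required_airflow_m3s(500.0, 10.0)
    print(f"{flow:.4f} m^3/s ({flow * M3S_TO_CFM:.0f} CFM)")
```

This estimate is a bulk figure that assumes all of the moving air actually passes over the heat-generating components; air that bypasses a component contributes nothing to cooling it.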
The ability to cool electrical components with air is restricted by the ability to channel or direct the airflow through the chassis and across the electrical components housed therein. Air follows the path of least resistance, and in many cases, the path of least resistance does not cross the electrical components that need to be cooled. Accordingly, large volumes of air may be pulled through a chassis without ever cooling certain ones of the electrical components contained inside. The end result is that the electrical components overheat and the computer system fails.
To direct the airflow through the chassis, existing systems include airflow barriers placed throughout the chassis. These airflow barriers, however, are generally designed around certain configurations of electrical components within the chassis, i.e., certain electrical components serve as airflow barriers. These electrical component configurations are often altered when particular components are removed or added. By removing or adding new components, the preferred airflow through the chassis is disturbed and air may stop flowing in particular areas of the chassis. These areas are commonly referred to as "dead spots."
A particular problem arises with respect to preventing "dead spots" around I/O or any other peripheral cards. I/O cards, for example, are generally arranged in closely spaced rows and located at or near the back of the computer chassis. Because of the arrangement and placement of the I/O cards, existing systems have difficulty in properly cooling them. Even though I/O cards generate relatively small amounts of heat, they can generate enough heat to cause a particular card and/or other computer system components to fail.
Because failure of any electrical component could disrupt the operation of the entire computer system, it is desirable to have a computer chassis that produces high-efficiency cooling, cools all electrical components housed inside the chassis, minimizes system downtime, and adapts to different electrical component configurations.