1. Field of the Invention
The present invention relates to user interfaces for computer systems. More particularly, the present invention relates to implementing a graphical user interface (GUI) to allow for easy and efficient management and maintenance of peripheral devices in a computer network.
2. Description of the Related Technology
As enterprise-class servers, which are central computers in a network that manage common data, become more powerful and more capable, they are also becoming ever more sophisticated and complex. For many companies, these changes lead to concerns over server reliability and manageability, particularly in light of the increasingly critical role of server-based applications. While in the past many systems administrators were comfortable with all of the various components that made up a standards-based network server, today's generation of servers can appear as an incomprehensible, unmanageable black box. Without visibility into the underlying behavior of the system, the administrator must "fly blind." Too often, the only indicator the network manager has of the relative health of a particular server is whether or not it is running.
It is well-acknowledged that there is a lack of reliability and availability of most standards-based servers. Server downtime, resulting either from hardware or software faults or from regular maintenance, continues to be a significant problem with significant costs. With emerging Internet, intranet and collaborative applications taking on more essential business roles every day, the cost of network server downtime will continue to spiral upward.
While hardware fault tolerance is an important element of an overall high availability architecture, it is only one piece of the puzzle. Studies show that a significant percentage of network server downtime is caused by transient faults in the I/O subsystem. These faults may be due, for example, to the device driver, the device firmware or the device hardware, which do not properly handle concurrent errors and often cause servers to crash or hang. The result is hours of downtime per failure while a system administrator discovers the failure, takes some action, and manually reboots the server. In many cases, data volumes on hard disk drives become corrupt and must be repaired when the volume is mounted. A dismount-and-mount cycle may result from the lack of "hot pluggability" or "hot plug" in current standards-based servers. Hot plug refers to the addition and swapping of peripheral adapters to an operational computer system. An adapter is simply any peripheral printed circuit board containing microchips, such as a PCI card, that may be removed from or added to a server peripheral device slot. Diagnosing intermittent errors can be a frustrating and time-consuming process. For a system to deliver consistently high availability, it should be resilient to these types of faults.
Existing systems also do not have an interface to control the changing or addition of an adapter. Since any user on a network could be using a particular adapter on the server, system administrators need a software application that controls the flow of communications to an adapter before, during, and after a hot plug operation on an adapter.
Current operating systems do not by themselves provide the support users need to hot add and swap an adapter. System users need software that will freeze and resume the communications of their adapters in a controlled fashion. The software needs to support the hot add of various peripheral adapters such as mass storage and network adapters. Additionally, the software should support adapters that are designed for various bus systems such as Peripheral Component Interconnect (PCI), CardBus, Micro Channel, Industry Standard Architecture (ISA), and Extended ISA (EISA). System users also need software to support the hot add and swap of adapters within canisters, which are detachable bus casings for a detachable bus system, and which also provide multiple slots for adapters.
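The "freeze and resume" behavior described above can be pictured with a short Python sketch. It is only an illustration under stated assumptions: the `Adapter` class, its request queue, and the `frozen` helper are hypothetical names invented here, and holding requests in a queue is just one plausible way to suspend communications in a controlled fashion.

```python
from collections import deque
from contextlib import contextmanager


class Adapter:
    """Hypothetical stand-in for a peripheral adapter's I/O path."""

    def __init__(self, name):
        self.name = name
        self.frozen = False
        self.pending = deque()   # requests held while frozen
        self.delivered = []      # requests that reached the adapter

    def send(self, request):
        # While frozen, requests are queued instead of reaching the hardware.
        if self.frozen:
            self.pending.append(request)
        else:
            self.delivered.append(request)

    def freeze(self):
        self.frozen = True

    def resume(self):
        # Unfreeze, then flush queued requests in arrival order.
        self.frozen = False
        while self.pending:
            self.delivered.append(self.pending.popleft())


@contextmanager
def frozen(adapter):
    """Freeze an adapter for the duration of a hot plug operation."""
    adapter.freeze()
    try:
        yield adapter
    finally:
        # Resume even if the operation fails, so traffic is never left stuck.
        adapter.resume()


if __name__ == "__main__":
    nic = Adapter("network adapter")
    nic.send("a")
    with frozen(nic):
        nic.send("b")   # held, not delivered, during the operation
    nic.send("c")
    print(nic.delivered)
```

In this sketch, requests issued while the adapter is frozen are held rather than dropped and are delivered in their original order once the adapter resumes.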
In a typical PC-based server, upon the failure of an adapter, the system must be powered down, the new adapter and adapter driver installed, the server powered back up and the operating system reconfigured. However, various entities have tried to implement the hot plug of these adapters in a fault-tolerant computer system. One significant difficulty in designing a hot plug system is protecting the circuitry contained on the adapter from being short-circuited when an adapter is added to a powered system. Typically, an adapter contains edge connectors which are located on one side of the printed circuit board. These edge connectors carry power from the system bus to the adapter and supply data paths between the bus and the adapter. These edge connectors fit into a slot on the bus on the computer system. A traditional hardware solution for "hot plug" systems includes increasing the length of at least one ground contact of the adapter, so that the ground contact on the edge connector is the first connector to contact the bus on insertion of the I/O adapter and the last connector to contact the bus on removal of the adapter. An example of such a solution is described in U.S. Pat. No. 5,210,855 to Bartol.
U.S. Pat. No. 5,579,491 to Jeffries discloses an alternative solution to the hot installation of I/O adapters. Here, each hotly installable adapter is configured with a user-actuable initiator to request the hot removal of an adapter. The I/O adapter is first physically connected to a bus on the computer system. Subsequent to such connection, a user toggles a switch on the I/O adapter which sends a signal to the bus controller. The signal indicates to the bus controller that the user has added an I/O adapter. The bus controller then indicates to the user, through a light-emitting diode (LED), whether the adapter can be installed on the bus.
However, the invention disclosed in the Jeffries patent also contains several limitations. It requires the physical modification of the adapter to be hotly installed. Another limitation is that the Jeffries patent does not teach the hot addition of new adapter controllers or bus systems. Moreover, the Jeffries patent requires that before an I/O adapter is removed, another I/O adapter must either be free and spare or free and redundant. Therefore, if there is no free adapter, hot removal of an adapter is impossible until the user adds another adapter to the computer system.
A related technology, not to be confused with hot plug systems, is Plug and Play defined by Microsoft® Corporation and PC product vendors. Plug and Play is an architecture that facilitates the integration of PC hardware adapters into systems. Plug and Play adapters are able to identify themselves to the computer system after the user installs the adapter on the bus. Plug and Play adapters are also able to identify the hardware resources that are needed for operation. Once this information is supplied to the operating system, the operating system can load the adapter drivers for the adapter that the user had added while the system was in a non-powered state. However, to date, Plug and Play has only been utilized for the hot docking of a portable computer to an expansion base.
Therefore, a need exists for improvements in server management which can result in continuous operation despite adapter failures. System users should be able to replace failed components, upgrade outdated components, and add new functionality, such as new network interfaces, disk interface adapters and storage, without impacting existing users. Additionally, system users need a process to hot add their legacy adapters, without purchasing new adapters that are specifically designed for hot plug. As system demands grow, organizations must frequently expand, or scale, their computing infrastructure, adding new processing power, memory, mass storage and network adapters. With demand for 24-hour access to critical, server-based information resources, planned system downtime for system service or expansion has become unacceptable.
The improvements of co-pending applications entitled "Hot Add of Devices Software Architecture" (Ser. No. 08/942,309) and "Hot Swap of Devices Software Architecture" (Ser. No. 08/942,457), as well as their related applications, all filed on Oct. 1, 1997, add hot swap and hot add capabilities to server networks. The recent availability of hot swap and hot add capabilities requires that a user maintaining the server, usually a server network system administrator, know or learn the numerous and complicated steps required to swap or add a peripheral device, including how to suspend the device adapters, how to power down and up the correct server slot and/or canister, etc. These steps are more fully disclosed in the co-pending applications referenced above and incorporated herein by reference. In addition, because servers have become very reliable, the system administrator will often be caught in a position of not knowing or having forgotten how to swap or add a peripheral adapter when a server malfunctions or when a new adapter needs to be added. Today's servers do not often malfunction, and the system administrator may add adapters only a few times a year.
Without detailed knowledge of the hot swap and hot add processes, the system administrator will be unable to change out and install peripheral devices. In that case, the entire server system must then be shut down, the peripheral adapter replaced or inserted, and the system restarted. This can result in severe losses to the system users in terms of network downtime and inability to service clients. In addition, it results in a failure to take advantage of the currently available hot swap and hot add technology. However, without an automated step-by-step process from the user's point of view, these results are inevitable.
Therefore, a need exists to automate, as much as possible, the hot swap and hot add processes, so that the benefits of those capabilities in a server are not compromised by insufficient technical knowledge of those processes on the part of the network administrator. Because implementation of the hot add and hot swap processes depends only on (1) which process is necessary (i.e., hot swap or hot add) and (2) which particular server peripheral device slot is concerned, the user should be able to perform these processes knowing only such information. The remaining steps in the hot swap and hot add processes may be completely automated. This can allow the necessary hot swapping and hot addition of adapters to be performed quickly and efficiently, by non-expert personnel, while the server is running. In the usual case, it would allow the system administrator to perform a hot swap or a hot add with little or no learning curve.
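As a rough illustration of this two-input idea, the following Python sketch expands a requested process and a slot number into an ordered list of steps. The function name `plan_hot_plug`, the step wording, and the exact ordering are assumptions made for illustration only; the actual procedures are those disclosed in the co-pending applications, not this sketch.

```python
from enum import Enum
from typing import List


class Process(Enum):
    HOT_SWAP = "hot swap"
    HOT_ADD = "hot add"


def plan_hot_plug(process: Process, slot: int) -> List[str]:
    """Return an ordered, illustrative step list for one slot.

    The steps are hypothetical placeholders, not the disclosed procedure.
    """
    if slot < 1:
        raise ValueError("slot numbers are assumed to start at 1")

    steps = []
    if process is Process.HOT_SWAP:
        # An existing adapter is in use, so its traffic is suspended first.
        steps.append(f"suspend adapter in slot {slot}")
    steps.append(f"power down slot {slot}")
    action = "replace" if process is Process.HOT_SWAP else "insert"
    # The only manual step: the user physically handles the adapter.
    steps.append(f"prompt user to {action} adapter in slot {slot}")
    steps.append(f"power up slot {slot}")
    if process is Process.HOT_SWAP:
        steps.append(f"resume adapter in slot {slot}")
    else:
        steps.append(f"configure new adapter in slot {slot}")
    return steps


if __name__ == "__main__":
    for step in plan_hot_plug(Process.HOT_SWAP, 3):
        print(step)
```

The user supplies only the process and the slot; everything else in the sequence is derived, which is the sense in which the remaining steps may be automated.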