1. Field of the Invention
The present invention relates generally to a computer system that provides a dual interrupt mechanism to designate the occurrence and termination of an event. More particularly, the present invention is directed to a system and method for alerting the computer system to event termination, such as when a hot-pluggable component is replaced within the computer system. Even more particularly, the present invention is directed to a dual interrupt control by which the computer system need not monitor or poll the status of a removed component.
2. Description of Related Art
Networks serve the purpose of connecting many different personal computers, workstations, or terminals to each other, and to a host computer, printers, file servers, etc., so that expensive computing assets, programs, files and other data may be shared among many users.
In a network utilizing a client/server architecture, the client (personal computer or workstation) is the requesting machine and the server is the supplying machine, both of which are connected via the network, such as a local area network (LAN) or wide area network (WAN). This is in contrast to early network systems that utilized a mainframe with dedicated terminals.
In a client/server network, the client contains the user interface and may perform some or all of the application processing and, as mentioned above, can include a personal computer or workstation. The server in a client/server network can be a high-speed microcomputer or minicomputer and, in the case of a high-end server, can include multiple processors and mass data storage, such as multiple hard drives and multiple CD-ROM drives. A database server maintains the databases and processes requests from the client to extract data from or update the database. An application server provides additional business processing for the clients. The network operating system (NOS) together with the database management system (DBMS) and transaction monitor (TP monitor) are responsible for the integrity and security of the server, as is understood in this art.
Client/server networks are widely used throughout many different industries and business organizations. The mass storage and multi-processing capabilities provided by current client/server network systems (i.e., high-end servers) permit a wide range of essential services and functions to be provided through their use.
As can be appreciated, many of these businesses are highly dependent upon the availability of their client/server network systems to permit these essential network services and functions to be carried out. As these client/server network systems become increasingly essential to the everyday operations of these businesses, additional steps need to be taken in the design and construction of the server in the client/server network system to ensure its continuous availability to the clients. That is to say, in the design and construction of a server, steps need to be taken to ensure that the server can be operated with little or no down time.
It should be understood, therefore, that server reliability and serviceability are two valuable design aspects in ensuring that a server will operate with little or no down time. The modularity of components within a server has been recognized as an important design consideration in ensuring that the down time of a server will be minimized. Modules can be removed and examined for operability or other purposes much more easily than permanently mounted fixtures within a server chassis. When various components of a server can be easily removed in a modular manner, they can also be readily replaced to maintain the operational status of the server.
Removable modular components today include disc drives and power supplies. As referenced above, the removability of modular components allows for better overall serviceability of the computer system which is a distinct advantage. For example, a defective power supply in the server or any computer system, such as the PC or workstation, generally requires prompt replacement in order to limit downtime. Modular components and connectors facilitate prompt replacement and are thus popular in many computer designs.
Originally, a rule of practice in the maintenance of modular components or printed circuit boards of a server was always to turn the power to the server off before any modular components or printed circuit boards were removed from or added to the chassis or support frame of the server. Recent innovations have addressed the desirability of inserting and removing modular components and printed cards from the chassis of the server (or any computer system) while the server is electrically connected and operational, i.e., "hot-pluggable."
Hot-pluggable components today include storage or disc drives, drive cages, fans, power supplies, system I/O boards, control boards, processor boards, and other subassemblies. The "hot" removability of these server components allows for better overall serviceability of the computer system, which is a distinct advantage to both the user and the maintenance technician.
Component redundancy has also been recognized as an important design consideration in ensuring that a server will operate with little or no down time. Essentially, component redundancy is provided to better ensure that at least one of the redundant components remains operable. Accordingly, with component redundancy, at least two components are provided that can perform the same function, such that if one of the components becomes faulty for some reason, operation transfers over to the redundant component. As long as at least one of the redundant components is operable, continued operation of the computer system is possible even if others of the redundant components fail. Therefore, to further enhance reliability and serviceability, redundant components have been made hot-pluggable.
Reconfiguration of the server system can also be accomplished with upgradable modular components. This can be accomplished by the addition or substitution of components having different circuits, e.g., updated or upgraded, disposed thereupon. When components are redundant and hot-pluggable, reconfiguration of the server is often possible without taking the server offline.
Another important design aspect with redundant and hot-pluggable components is to ensure and maintain a safe working environment while the server is operating and being repaired or upgraded. Therefore the exposure of hot connectors and contacts must be kept to a minimum.
Steps are similarly taken in the design and construction of the server system to ensure that the server system is readily serviceable, such that when the client/server network system must be serviced the down time can be minimized. In existing client/server network systems it is often difficult to obtain important data corresponding to a component failure in order to facilitate the quick serviceability of the server. Therefore, the more information that can be readily provided to locate a defective component or problem with the server, the more the amount of time the server is down can be reduced.
A computer server is an exemplary computer system, and is typically utilized when a group of discretely-positioned computer systems are connected together in a networked fashion. The computer server, and files contained therein, is selectively accessible by any of the computers in the networked connection with the computer server. When access to the files stored at the computer server is essential to perform a particular service or function, it is imperative that the computer servers be online and available so that the files stored therein can be accessed.
A user interface for a computer system provides selected information relating to the computer system in human perceptible form to a user of the computer system. A user interface sometimes also permits a user of the computer system to input commands to the computer system. A computer keyboard and a video display terminal are exemplary of the user interfaces conventionally used in conjunction with a computer system.
When one of a pair (or larger set) of redundant devices fails, e.g., a primary or backup power unit or fan, the defective device must be replaced. Since a replacement may not be readily available due to shortage and/or demand, the server or other computer system may be required to operate without the unit for some time, perhaps weeks or more, before the replacement arrives and is installed.
Conventional computer systems, such as servers, are interrupt-driven. Interrupts have been used since the introduction of the mainframe computers of the 1950s to alert the computer system, e.g., the system's processor, of special conditions occurring therein, which generally relate to Input/Output (I/O). Computer systems, however, employ a wide variety of interrupts to handle a corresponding variety of particular conditions requiring the attention of the processor. The action taken by the processor responsive to the interrupt is referred to as "servicing" the interrupt.
Within servers and other computer systems employing redundant devices, the action of removing a given, hot-pluggable component triggers an interrupt mechanism to alert the system of this change in operational status. The system, in turn, initiates a polling or monitoring routine to periodically check on the status of the absent unit, i.e., to detect a consequent change back to an operational mode. It should be understood that this polling routine is itself an interrupt-driven mechanism. Depending on the periodicity of the polling schedule, the system eventually detects the replacement of the defective device. Polling is then canceled. In implementation, it should be understood that conventional computer systems may periodically poll a flag value, e.g., a power_unit_present flag for a given power unit, associated with the particular action or event to determine the status of the missing (flag set) or present (flag zeroed) device in question.
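By way of illustration only, the conventional polling behavior described above might be simulated as in the following sketch. The class name PowerUnit, the function polls_until_detected, and the tick-based timing model are assumptions introduced here for illustration and do not appear in any particular conventional system; only the power_unit_present flag and its set/zeroed convention are taken from the description above.

```python
class PowerUnit:
    """Hypothetical redundant power unit (illustrative only)."""

    def __init__(self):
        # Convention from the description: flag set (1) = unit missing,
        # flag zeroed (0) = unit present.
        self.power_unit_present = 0


def polls_until_detected(unit, replaced_at_tick, poll_period, max_ticks=10_000):
    """Simulate the polling routine started by the removal interrupt.

    Returns (polls, detected_tick): how many times the flag was polled
    and the tick at which the replacement was finally noticed.
    """
    unit.power_unit_present = 1  # removal interrupt: flag set
    polls = 0
    for tick in range(1, max_ticks + 1):
        if tick == replaced_at_tick:
            unit.power_unit_present = 0  # replacement installed: flag zeroed
        if tick % poll_period == 0:      # periodic poll consumes processor time
            polls += 1
            if unit.power_unit_present == 0:
                return polls, tick       # polling is then canceled
    raise RuntimeError("replacement never detected within max_ticks")
```

In this sketch, every poll performed while the unit is absent is wasted work, and the replacement is noticed only at the first poll following it, so detection can lag the actual replacement by up to one polling period.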
Since the aforedescribed periodic polling by the computer system consumes system resources, particularly processor time, it is an object of the present invention to conserve system resources and improve system performance by eliminating unnecessary polling.
It is a further object of the present invention to provide a system and method for ascertaining the termination of an interrupt-driven event upon the conclusion of that event instead of waiting for the polling routine to detect the change in event status.
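As a rough sketch of this object, and not a description of any specific implementation, a second interrupt raised at the conclusion of the event could update the system state directly, so that no polling routine is ever scheduled. The class DualInterruptController, its method names, and the device identifier used below are hypothetical and are introduced solely for illustration.

```python
class DualInterruptController:
    """Hypothetical controller (illustrative only): one interrupt marks the
    occurrence of an event, a second interrupt marks its termination."""

    def __init__(self):
        self.missing = set()  # identifiers of devices currently removed
        self.poll_count = 0   # remains 0: no polling routine is scheduled
        self.log = []         # ordered record of serviced interrupts

    def on_removal_interrupt(self, device_id):
        # First interrupt: the hot-pluggable device has been removed.
        self.missing.add(device_id)
        self.log.append(("removed", device_id))

    def on_replacement_interrupt(self, device_id):
        # Second interrupt: the event has terminated (device replaced).
        self.missing.discard(device_id)
        self.log.append(("replaced", device_id))

    def is_degraded(self):
        # True while at least one redundant device is absent.
        return bool(self.missing)
```

For example, calling on_removal_interrupt("psu0") followed later by on_replacement_interrupt("psu0") returns the controller to a non-degraded state at the moment of replacement, with poll_count still zero.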
It is in the light of this background information related to computer systems, redundant devices and the generation of interrupt messages that significant improvements of the present invention have evolved.