The invention relates to the monitoring and replacement of field replaceable units (FRUs) for electronic equipment, for example for a telecommunications or other application where high standards are set and where the unit may, for example, be remote from a service center and the replacement may need to be effected by non-skilled personnel.
FRUs can be used in many different systems. They find particular but not exclusive application to computer systems, for example to fault tolerant computer systems where it is desirable to be able readily to replace units which have developed a fault or have been superseded by a more recent version.
Examples of FRUs for such a system can include, for example, a CPU, a PCI card, power supply units (PSUs), a motherboard, or any other system components. One FRU, for example a field replaceable card, can include hardware for implementing several devices (e.g. a multiple Ethernet adapter, or a SCSI adapter with an Ethernet adapter).
It is known to provide FRUs with non-volatile memory (e.g. EEPROMs), which can contain information relating to the FRU. In a known system, FRUs can include basic FRU identification information in the non-volatile memory.
It is also known to provide a system management suite, collectively known as a configuration management system (CMS) which manages the FRUs, other devices and system resources using objects to represent the FRUs, devices and other system resources. An object forms a particular instance of a CMS class, which is defined by a CMS definition (CMSDEF).
For example, a CAF (Console and Fans unit) CMSDEF defines the CAF CMS class of which the object CAF_1 is an instance that represents a particular CAF FRU. The CAF_1 object may have an attribute called LOCATION having the value A. CAF, indicating that the FRU represented by the CAF_1 object has been inserted into location A. CAF in the chassis of the computer system.
In order correctly to manage the FRUs, the CMS requires access to the non-volatile memory in the FRUs. In order to gain access to the non-volatile memory in the FRUs, it is necessary that power is supplied to the FRUs. However, this conflicts with safety requirements relating to telecommunications equipment, which require that where a FRU is faulty it is necessary to power down the FRU.
It is known to provide a fuse on a FRU to isolate circuitry of the FRU in the event of an electrical fault. However, in the event of a fault occurring at the interconnections to the FRU, for example in the event of a short circuit between connector pins, the fuse may not protect against this. It would be possible to locate such a fuse in a power supply sub-system of the electronic equipment such that it would also detect faults at the interconnection to the FRU. However, in the event of a fault, it would be necessary for the maintenance engineer to replace or reset the fuse in addition to replacing the FRU.
Accordingly, the present invention seeks to address the powering of a FRU in a manner that can provide protection against faults, while not complicating the tasks required of a maintenance engineer when replacing a faulty FRU.
Particular and preferred aspects of the invention are set out in the accompanying independent and dependent claims. Combinations of features from the dependent claims may be combined with features of the independent claims as appropriate and not merely as explicitly set out in the claims.
In accordance with a first aspect of the invention, there is provided a power sub-system for controlling a supply of power to a FRU for electronic equipment. The power sub-system includes a power controller that is arranged, in response to the detection of a fault, to switch off the supply of power to a FRU. The power sub-system is further operable subsequently, in response to a sequence of two events, to switch on the supply of power to the FRU. The first event is a first change in state of an interlock signal indicative of the FRU being released. The second event is a second change of state of the interlock signal indicative of a FRU being secured in position.
By causing power to be cut on detection of a fault, and then restored after an indication of the FRU being released followed by an indication of a FRU being secured in position, the temporary interruption of power to the FRU location is managed automatically.
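The sequence described above can be illustrated, purely by way of a sketch, as a three-state machine. The class and method names (`PowerController`, `on_fault`, `on_interlock`) and the state names are hypothetical and do not appear in the specification. The sketch assumes that the FRU starts secured and powered, and it does not model removal of a FRU in the absence of a fault.

```python
from enum import Enum, auto


class PowerState(Enum):
    ON = auto()                 # power supplied to the FRU
    OFF_AWAIT_RELEASE = auto()  # fault seen; waiting for the FRU to be released
    OFF_AWAIT_SECURE = auto()   # FRU released; waiting for a FRU to be secured


class PowerController:
    """Hypothetical model of the fault / release / secure sequence."""

    def __init__(self) -> None:
        # Assumption: the FRU is secured and powered at start-up.
        self.state = PowerState.ON

    @property
    def power_on(self) -> bool:
        return self.state is PowerState.ON

    def on_fault(self) -> None:
        # A detected fault switches off the supply of power.
        if self.state is PowerState.ON:
            self.state = PowerState.OFF_AWAIT_RELEASE

    def on_interlock(self, secured: bool) -> None:
        # First event: interlock signal changes to "released".
        if self.state is PowerState.OFF_AWAIT_RELEASE and not secured:
            self.state = PowerState.OFF_AWAIT_SECURE
        # Second event: interlock signal changes back to "secured".
        elif self.state is PowerState.OFF_AWAIT_SECURE and secured:
            self.state = PowerState.ON
```

In this sketch, power is not restored merely because the interlock signal indicates that a FRU is secured; the release must be observed first, so a faulty FRU left in place remains unpowered.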
An embodiment of the invention thus provides significant advantages over systems where a fuse or other trip device requires a maintenance engineer to replace the fuse or reset the trip manually. The maintenance engineer does not need to perform any actions other than the removal of the FRU and the replacement of that, or another replaceable unit, to restore the power. Accordingly, an embodiment of the invention enhances safety and security during maintenance operations when hot swapping FRUs. Typically, a replacement FRU would be used to replace a faulty FRU that is removed. However, it is also possible that the same FRU could be reused if the fault were repaired, or perhaps a unit on the FRU was reset, or the like.
The provision of the arrangement for controlling the supply of power separate from the FRU means that the power sub-system can detect and address faults associated with the connections between the power sub-system and the FRU (e.g., a short between individual connectors) as well as faults within the FRU itself. An embodiment of the invention thus provides further advantages over an arrangement where a fuse element on the FRU is used to isolate an electrical fault.
In an embodiment of the invention, the power controller includes a logic circuit responsive to a fault signal to switch off the supply of power and responsive to the first and second changes of state of the interlock signal to switch on the supply of power. However, in other embodiments, a suitable programmed microcontroller or microprocessor could be employed to implement the control logic.
A semiconductor switch (e.g., a transistor switch), under the control of the power controller, can provide for switching on and off of the supply of power to a power line to the FRU. A sensor circuit responsive to an overcurrent on the power line can be used to detect an electrical fault of the FRU or a fault in the connections between the power sub-system and the FRU. The logic circuit is connected to the sensor circuit to receive the fault signal therefrom in response to the overcurrent on the power line.
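The relationship between the sensor circuit, the fault signal and the switch can be sketched in software as follows. This is an illustrative model only: the name `SwitchedPowerLine`, the trip threshold and the use of amperes as the unit are assumptions, and the specification itself describes a hardware sensor circuit and logic circuit rather than code.

```python
from dataclasses import dataclass


@dataclass
class SwitchedPowerLine:
    """Hypothetical model of a power line with an overcurrent sensor."""

    trip_current_amps: float   # assumed threshold; no value is given in the text
    switch_closed: bool = True  # True while power is supplied to the FRU

    def fault_signal(self, measured_amps: float) -> bool:
        # The sensor asserts the fault signal on an overcurrent.
        return measured_amps > self.trip_current_amps

    def sample(self, measured_amps: float) -> bool:
        # The logic acts on the fault signal only while power is being supplied.
        fault = self.switch_closed and self.fault_signal(measured_amps)
        if fault:
            self.switch_closed = False  # open the switch: power off
        return fault
```

Because the sensor in this sketch measures the current on the power line itself, it responds in the same way to a fault within the FRU and to a short at the connector.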
In an embodiment of the invention, an interlock signal line carries an interlock signal when the FRU is locked in the electronic equipment.
The interlock signal is preferably a predetermined potential on the interlock signal line. The first change in state can be the removal of the predetermined potential and the second change of state can be the reinstatement of the predetermined potential. In an embodiment of the invention, the predetermined potential is ground potential.
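By way of illustration, the two changes of state can be derived from successive samples of the interlock signal line as sketched below. The function name `interlock_events`, the event labels, and the representation of ground potential as the integer 0 are hypothetical. The sketch also assumes that the line sits at some non-ground level when the predetermined potential is removed, which the text does not specify.

```python
from typing import Iterable, Iterator

GROUND = 0  # assumed encoding of the predetermined (ground) potential


def interlock_events(levels: Iterable[int],
                     initially_locked: bool = True) -> Iterator[str]:
    """Yield 'released' / 'secured' for each change of the interlock line."""
    locked = initially_locked
    for level in levels:
        now_locked = (level == GROUND)
        if locked and not now_locked:
            yield "released"   # first change: potential removed
        elif not locked and now_locked:
            yield "secured"    # second change: potential reinstated
        locked = now_locked
```

For example, under these assumptions, `list(interlock_events([0, 0, 1, 1, 0]))` evaluates to `["released", "secured"]`.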
Debounce logic can be provided between the interlock signal line and power controller for debouncing the interlock signal. This avoids intermittent contact (e.g., due to switch bounce) unintentionally triggering the reinstatement of power following an interruption due to a fault.
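One common way of debouncing such a signal, offered here only as a sketch and not as the debounce logic of the embodiment, is to accept a new state only after it has been observed on a number of consecutive samples. The class name `Debouncer` and the parameter `stable_samples` are hypothetical.

```python
class Debouncer:
    """Counter-based debounce: a change is accepted only once it is stable."""

    def __init__(self, initial: bool, stable_samples: int = 3) -> None:
        if stable_samples < 1:
            raise ValueError("stable_samples must be at least 1")
        self.output = initial            # debounced state
        self._count = 0                  # consecutive samples differing from output
        self._stable_samples = stable_samples

    def sample(self, raw: bool) -> bool:
        if raw == self.output:
            self._count = 0              # bounce ended; discard partial count
        else:
            self._count += 1
            if self._count >= self._stable_samples:
                self.output = raw        # change has persisted long enough
                self._count = 0
        return self.output
```

With `stable_samples=3`, a single spurious sample of the opposite state is ignored, so a momentary contact during removal would not be treated as a FRU having been secured.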
In accordance with another aspect of the invention, there is provided electronic equipment including a power sub-system for controlling the supply of power to a FRU, the power sub-system comprising a power controller that is arranged, in response to the detection of a fault, to switch off a supply of power to a FRU; and subsequently, in response to a first change in state of an interlock signal indicative of the FRU being released, followed by a second change of state of the interlock signal indicative of a FRU being secured in position, to switch on the supply of power to the FRU.
The FRU can be a computer system component. The computer system can be a rack-mounted computer system, for example, a fault-tolerant computer system.
In accordance with another aspect of the invention, there is provided a FRU including an interlock mechanism for locking the FRU in the electronic equipment. An interlock switch is operated by the interlock mechanism and causes an interlock signal line to be connected to a source of the predetermined potential when the interlock mechanism locks the FRU in the electronic equipment.
In this manner, the interlock signal is provided automatically when the FRU is locked in position in the equipment, and is interrupted when the lock is released.
The power sub-system and the FRU comprise co-operating connector arrangements for interconnecting a plurality of power and signal lines of the power sub-system to a corresponding plurality of power and signal lines of the FRU. Among those power and signal lines in the power sub-system and the FRU are a main power line for the supply of power to the FRU, a ground line, and an interlock signal line.
In a particular embodiment of the invention, the FRU is a PCI card carrier assembly. Moreover, the FRU comprises power conversion circuitry for supplying different voltages to a connectable PCI card.
In accordance with yet another aspect of the present invention, there is provided a method of controlling a supply of power to a FRU for electronic equipment, the method comprising: in response to the detection of a fault, switching off the supply of power to a FRU; and subsequently, in response to a first change in state of an interlock signal indicative of the FRU being released, followed by a second change of state of the interlock signal indicative of a FRU being secured in position, switching on the supply of power to the FRU.
Thus, in accordance with an embodiment of the invention, a power sub-system controls a supply of power to a FRU for electronic equipment. The power sub-system includes a power controller that is arranged, in response to the detection of a fault, to switch off the supply of power to a FRU. The power controller is then responsive to a sequence of two events to switch on the supply of power to the FRU. The first event is a first change in state of an interlock signal indicative of the FRU being released. The second event is a second change of state of the interlock signal indicative of a FRU being secured in position.
An advantage of the invention that should be apparent from the above is the automatic manner in which power can be removed and then reinstated, without specific acts being required of a maintenance engineer other than the mechanical operations that are necessary to remove and replace a FRU. This reduces the time needed to replace the FRU, and avoids further errors as a result of a maintenance engineer failing to restore power to the subsystem as would be necessary if a fuse or a conventional trip were used.
Further objects and advantages of the invention will be apparent from the following description.