Prognostic methods are used to improve the reliability of deployed systems by looking at components that have high failure rates and critical impact on performance within the systems. Detectors, or sensors, monitor these systems and look for failure precursors that indicate the high-failure-rate components have entered a wear-out mode and are degrading toward failure. By knowing the progression of failure dynamics for a device, an accurate prediction of time to failure or remaining useful life (RUL) can be made and an appropriate maintenance action, such as removing and replacing the device, can be initiated to avoid system failure during a time of operation. Fault-to-failure signature detection is a method or capability to detect and report a precursor-to-failure or incipient fault condition of a component device or assembly containing the component. Such detection is the basis for a notification capability to provide early warning of degradation and eventual failure.
A direct approach is to place sensors at the board level at each node of each component having a significant rate of failure: faults are detected and tracked. In many cases, there are interdependencies in these tracking measurements that require an expert system to produce RUL estimates with greater accuracy. This direct approach is invasive because it requires internal access to components within the system: adding sensors inside the power supply imposes an additional reliability load. Manufacturers of switch-mode power supplies can be reluctant to enable prognostics on their supplies, believing that the benefits of this capability do not justify the cost and/or that these benefits do not outweigh the additional reliability burden. Currently, this direct approach to prognostics has not been adopted in many applications. A non-invasive approach uses external access methods, such as using an output voltage terminal, to attach electronic equipment to measure values, inject stimuli and sense responses.
FIG. 1 is a simplified block diagram of a switch mode power supply (hereinafter “SMPS”) 1, in accordance with the known prior art. The SMPS 1 has a direct voltage input 3. An SMPS uses relatively high-frequency switching devices, such as a power Metal Oxide Semiconductor Field Effect Transistor (hereinafter “MOSFET”) switch, sometimes contained within a circuit. In FIG. 1, the MOSFET switch is contained within a Pulse Width Modulator (hereinafter “PWM”) 5. Switching frequencies of one-hundred-thousand Hertz (100 kHz) or higher are used to convert the input direct voltage to a first pulsed waveform 7. One or more high-frequency transformers in Isolation device 9 with one or more outputs are used to provide an isolation barrier and to buck or boost the first pulsed waveform 7 to a second pulsed waveform 11 having a different voltage amplitude. The second pulsed waveform 11 is rectified and filtered by an output filter 13 to produce one or more direct voltages with a positive terminal 15A and a negative terminal 15B at voltage levels different from the direct voltage input 3. Feedback is provided through the use of either a small pulse transformer or an opto-isolator in a feedback circuit 17. The feedback circuit 17 produces feedback output 19 to PWM 5 to control the pulse width or pulse frequency or both of the switching devices in the PWM 5 to regulate the output direct voltage across the terminals 15A and 15B.
FIG. 2 is a schematic diagram of an opto-isolator 21 from the feedback circuit 17 of FIG. 1, in accordance with the known prior art. Referring to FIG. 1 and to FIG. 2, in an SMPS 1, there are three components that are at the root of the majority of failures: (1) an output capacitor in the output filter 13, (2) a power MOSFET switch in either or both the Pulse Width Modulator 5 and the Isolation device 9, and (3) an opto-isolator in the feedback circuit 17. The basic operation of an opto-isolator 21, as shown in FIG. 2, is the following: an input current flows into a first opto-isolator input terminal 23A, through a light-emitting diode 25 and out through a second opto-isolator input terminal 23B; the current flow through the light-emitting diode 25 produces light 27; the produced light 27 causes current to flow in the base of an opto-isolator transistor 29, which causes current to flow through the opto-isolator transistor 29 via output terminals 31A and 31B. The input current to the opto-isolator 21 is the result of circuitry to which the opto-isolator 21 is connected in feedback circuit 17 in FIG. 1; the collector at the top of the opto-isolator transistor 29 is connected to circuitry in the Pulse Width Modulator 5 shown in FIG. 1.
The ratio of the output current to the input current of an opto-isolator 21 is defined as the Current Transfer Ratio (CTR) of the opto-isolator 21. Feedback circuits 17 are designed to cause a regulated and set direct output voltage, VDC, across output terminals 15A and 15B in FIG. 1. The design of the feedback circuit 17 is predicated on a CTR greater than 1. As an opto-isolator 21 is stressed during operation, for example by heat or current, the crystal lattice of the semiconductor material comprising the opto-isolator 21 may develop defects, especially point defects. Thus, the opto-isolator 21 is damaged. Such defects reduce the light emitting efficiency of the light-emitting diode 25 and the amplification factor of the opto-isolator transistor 29, both of which reduce the CTR. When the CTR is reduced, the opto-isolator 21 is said to be operating in a degraded state. When the stress conditions are removed, the crystal lattice self-anneals and most, but not all, of the lattice damage is repaired. After repeated cycles, the CTR is gradually reduced so low that the SMPS 1 fails to correctly regulate the direct output voltage, VDC, and the SMPS 1 is deemed to be defective. A defective SMPS 1 is typically removed and replaced, with the removed SMPS 1 being set aside for subsequent testing, evaluation and repair. Because it is set aside, the stress conditions are removed, the opto-isolator 21 self-anneals and the CTR increases. Often the increase in the CTR is sufficient to result in correct voltage regulation during a re-test, and the SMPS 1 is evaluated as not being defective. This intermittent fault behavior, defective in operation but passing during re-test, is a major contributor to a high number of SMPS's being swapped out as defective and being returned to regular usage after maintenance fails to duplicate the defective behavior.
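In symbols, the definition of the CTR given above can be written as follows, where the names I_in (current through the light-emitting diode 25) and I_out (current through the opto-isolator transistor 29) are chosen here for illustration and do not appear in the figures:

```latex
\[
\mathrm{CTR} = \frac{I_{\mathrm{out}}}{I_{\mathrm{in}}}
\]
```

As a purely illustrative example, an input current of 5 mA that produces an output current of 10 mA corresponds to a CTR of 2.0, which satisfies the design premise of a CTR greater than 1.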
The primary fault-to-failure progression of an opto-isolator 21 is a gradual decrease over time of the current transfer ratio (CTR) from a value greater than 1.0 and typically less than 3.0 to a value below 0.2. As the CTR decreases, the ability of the SMPS 1 to regulate the direct output voltage VDC degrades until the SMPS 1 is unable to adequately regulate the direct output voltage VDC. Of interest are the following: (1) the identification and characterization of a fault-to-failure progression signature for opto-isolators 21 in feedback loops of SMPS's 1; (2) a method to use the fault-to-failure progression signature in the design and implementation of a non-invasive sensor for prognostication of opto-isolators 21 in the feedback loop; and (3) a measure of time remaining before the CTR will decrease to a value so low as to cause the power supply to fail to regulate the direct output voltage VDC.
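Item (3) above, a measure of time remaining, can be illustrated with a minimal sketch. The sketch assumes that CTR samples are already available at known times and that the decline is roughly linear over the observed window; neither assumption comes from the description above, which states only that the decrease is gradual. The function name, the default threshold argument and the sample values are hypothetical; the value 0.2 is taken from the failure level mentioned above.

```python
def estimate_time_to_threshold(times, ctrs, threshold=0.2):
    """Fit a least-squares line to (time, CTR) samples and return the time
    remaining after the last sample until the line reaches `threshold`.

    Returns None when the fitted CTR is not decreasing.
    A linear decline is an assumption made only for this illustration.
    """
    n = len(times)
    if n < 2 or n != len(ctrs):
        raise ValueError("need at least two (time, CTR) pairs of equal length")

    mean_t = sum(times) / n
    mean_c = sum(ctrs) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    if sxx == 0:
        raise ValueError("time samples must not all be identical")

    # Slope and intercept of the least-squares line CTR = intercept + slope * t.
    slope = sum((t - mean_t) * (c - mean_c) for t, c in zip(times, ctrs)) / sxx
    intercept = mean_c - slope * mean_t

    if slope >= 0:
        return None  # no downward trend, so no crossing can be projected

    t_cross = (threshold - intercept) / slope
    return max(0.0, t_cross - times[-1])


# Hypothetical samples: operating hours and measured CTR.
hours = [0.0, 100.0, 200.0, 300.0]
ctr_values = [2.0, 1.8, 1.6, 1.4]
remaining = estimate_time_to_threshold(hours, ctr_values)
print(f"projected hours until CTR reaches 0.2: {remaining:.0f}")
```

A straight-line fit is the simplest possible choice; because the opto-isolator 21 partially self-anneals when stress is removed, a real degradation curve need not be linear, and any projection of this kind should be treated as a rough indication only.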
There are a number of prognostic methods to identify failure precursors in switching power supplies that rely on internal or external measurements. One known method is to monitor ripple voltage at the output terminals of the power supply as discussed in Layyani, “Failure Prediction of Electrolytic Capacitors of a Switch Mode Power Supply,” IEEE Transactions on Power Electronics, Vol. 13, No. 6, November 1998. The precursor to failure of the method of Layyani is an increase in ripple voltage caused by increasing degradation of the capacitor as it fails.
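The ripple-voltage precursor described above can be sketched as a comparison of ripple amplitude against a baseline. The sketch below is not the procedure of the cited reference; it assumes peak-to-peak amplitude as the ripple measure and assumes that sampled output voltages from a healthy baseline period are available. All names and sample values are hypothetical.

```python
def ripple_peak_to_peak(samples):
    """Peak-to-peak amplitude of a window of sampled output voltage."""
    if not samples:
        raise ValueError("samples must not be empty")
    return max(samples) - min(samples)


def ripple_increase_ratio(baseline_samples, current_samples):
    """Ratio of current ripple to baseline ripple; values above 1 indicate growth."""
    baseline = ripple_peak_to_peak(baseline_samples)
    if baseline <= 0:
        raise ValueError("baseline ripple must be positive")
    return ripple_peak_to_peak(current_samples) / baseline


# Hypothetical sampled output voltages, in volts.
healthy = [11.99, 12.01, 12.00, 11.99, 12.01]
aged = [11.97, 12.03, 12.00, 11.98, 12.02]
print(f"ripple has grown by a factor of {ripple_increase_ratio(healthy, aged):.1f}")
```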
A second method disclosed in U.S. Pat. No. 4,245,289, to Mineck, is to measure the duty cycle modulated by an integrated circuit (IC) component that is responsible for switch timing in a regulated power supply. Mineck is based on the premise that electronic components consume more power as they begin to fail. Overall efficiency decreases with the increased power consumption. As the output is regulated to produce a predetermined and set value, the switching duty cycle must compensate by changing the relative on- and off-times. The precursor-to-failure utilized in Mineck is an increase in duty cycle. The method of Mineck is non-specific with regard to exactly which component is failing. Furthermore, the method of Mineck is invasive because it requires measurement of an internal node voltage waveform.
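The duty-cycle precursor can be illustrated with a minimal sketch. It assumes the on- and off-times of the switching waveform have already been extracted from the internal node, which is the invasive step noted above, and it is not taken from the Mineck patent itself. The function names, the margin and the sample values are hypothetical; the 10 microsecond period corresponds to the 100 kHz switching frequency mentioned earlier.

```python
def duty_cycle(on_time, off_time):
    """Fraction of one switching period during which the switch is on."""
    period = on_time + off_time
    if on_time < 0 or off_time < 0 or period <= 0:
        raise ValueError("times must be non-negative with a positive period")
    return on_time / period


def duty_cycle_has_risen(baseline, current, margin=0.05):
    """True when the duty cycle exceeds the baseline by more than `margin`.

    The margin is an arbitrary illustrative value, not a recommended setting.
    """
    return (current - baseline) > margin


# Hypothetical measurements over one 10 microsecond period (100 kHz switching).
baseline_dc = duty_cycle(4.0e-6, 6.0e-6)   # about 0.40 when healthy
current_dc = duty_cycle(4.8e-6, 5.2e-6)    # about 0.48 after degradation
print(duty_cycle_has_risen(baseline_dc, current_dc))
```

As the paragraph above notes, a rise in duty cycle does not identify which component is degrading, so a check of this kind can only flag that some loss mechanism has increased.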
An obvious prognostic method might be to measure the difference between the actual direct output voltage and a designed-for direct output voltage set point. Any trend in the difference value indicates that the power supply is degrading along a trajectory toward failure; but as a prognostic or precursor to failure for an opto-isolator 21, this approach is limited because the SMPS 1 will continue to regulate the direct output voltage up to the point where the opto-isolator 21 actually fails: there is an absence of any precursor to failure from the direct output voltage.
None of the prior art methods discussed above is able to detect increasing degradation in the operation of an opto-isolator 21 in a feedback loop of an SMPS 1, and there is no known method or means to predict the failure of this component using only measurement at external nodes, such as the direct output voltage terminal of the SMPS 1.
There are existing patents concerned with monitoring of the direct output voltage and feedback circuits 17. In those cases, the monitoring is used for control and provides no information regarding the health of the feedback components. These inventions are in a different category from the prognostic inventions. There are also patent applications related to processing of monitored data in switching supplies, such as US patent application publication no. 20050289378, in which an integrated circuit approach for multi-parameter monitoring is proposed. The approach of US patent application publication no. 20030039129 detects an abrupt load change and is implemented as part of a strategy to reduce overshoot and improve overall transient response.