Applications for high-performance electronic devices and systems have extended into almost every aspect of modern life. This wider usage exposes electronic systems to ever harsher thermal, environmental, and electromagnetic conditions. Meanwhile, the same advances in semiconductor miniaturization and system packaging that make these performance and capability increases possible also tend to bring higher sensitivity to damage and functional upset, especially from unavoidable electromagnetic transients. Integrated circuits (ICs) within these systems are susceptible to damaging and disruptive electrostatic discharge (ESD) pulses, lightning surges, other electrical fast transients, and/or the effects of single event upsets (SEUs) in the operating environment. System designers must carefully balance cost constraints and performance demands against minimum robustness requirements in their products. Consideration must be given not only to the end user environment, but also to the development and manufacturing phases of production.
Currently, there are a number of solutions for characterizing and qualifying system and component transient robustness, fewer methods for monitoring manufacturing and assembly processes, and extremely limited options for identifying and analyzing transient-induced field failures.
Industry expert groups have carefully characterized various aggressor transients and defined standards and specifications of transient pulse simulator systems for repeatable product characterization and qualification. Various system product industries (consumer, computer, automotive, etc.) tend to select appropriate levels of these defined pulse characteristics, test methods, and failure criteria as a minimum qualification level for that product class. These well-defined and universally accepted testing standards primarily address this qualification aspect of the final system as a unit, with little, if any, information extractable from the results regarding individual subsystem components and failure mechanisms, and provide almost no insight into the statistical margins between pass and fail.
Some attempts have been made to adapt these system level tests in order to isolate and instrument device nodes for analysis, but these solutions fail to meet the needs of the industry because the probing techniques are either invasive (i.e., the system must be disassembled, thus breaking the system integrity and exposing it to unrelated induced electromagnetic fields from pulse simulators) or do not directly correlate back to the final qualification test pass/fail criteria (for example, common device leakage current tests do not necessarily correlate with software upsets or failures). Other solutions attempt to focus strictly on hard failures (i.e., permanent destructive circuit damage), but these solutions are similarly unable to meet the needs of the industry because hard failures are only one aspect of system robustness. Soft failures (e.g., software upsets and recoverable system resets) are becoming more prevalent as semiconductor process geometries shrink, process voltages drop, and circuitry becomes faster.
Still other solutions seek to adapt electromagnetic interference (EMI)/electromagnetic compatibility (EMC) three-dimensional scanning systems, in conjunction with injected pulse generator simulators, for use in current reconstruction and transient susceptibility systems. These systems attempt to infer the transient currents into and out of device pins or nodes in the system from the measured electromagnetic H-fields and E-fields, or to actively inject them for observation. They attempt to provide a detailed estimate of which local device is affected by the transient pulse and how each device is affected by the residual transient pulse after attenuation by protection devices. But these solutions also fail to meet industry needs. First, scanning a 5 cm×5 cm printed circuit board (PCB) can take as much as 20 hours, and that scan covers only a single I/O port. Second, they are inordinately expensive due to the precision scanning hardware required. Finally, they can only be applied to a sub-assembly or planar PCB that is accessible on one open side (not mounted in the system enclosure, nor in a daughterboard backplane configuration).
Meanwhile, many advances have been made in the area of electrostatic discharge (ESD) process controls in the product manufacturing environment (ESD being one of the most common aggressor transient pulse issues confronting system designers). ESD event detectors and continuous real-time monitors on assembly lines can detect, and alert personnel to, electromagnetic field levels that are dangerous to components, arising from poor grounding, handling, and packaging on the controlled manufacturing floor. Until components are installed on circuit boards and assembled into their complete enclosures, they are substantially more susceptible to transient-induced damage. Sporadic failures on a system manufacturer's poorly controlled assembly line are often blamed on insufficiently robust devices from the component vendor. Standard operating procedure in this case is to physically remove the suspect component and return it to the component vendor (a procedure during which the component is often further damaged by manual desoldering, handling, etc.). The component vendor then attempts a root-cause analysis, and the microscopic evidence is often inconclusive.
Routine collection of on-board event detection and analysis data can provide a non-destructive, early indicator of manufacturing process health. This data can help identify the problem, as well as pinpoint the liability, long before a part is removed and returned for expensive root-cause analysis. However, existing techniques are unable to meet this need.
Another potential problem that can arise in the field is inter-block damage within chips due to internal transients on power rails during gate switching. At ultralow operating voltages, a glitch of even a few millivolts on the power rail aligned at the critical switching time of a CMOS gate can cause upset or even permanent damage.
Accordingly, detecting such transients on internal nodes as well as input/output (I/O) nodes of integrated circuits of a system can be crucial to debugging the system. However, current technology does not address this need.
Additionally, typical end user environments usually have no data collection, no site analysis, and not even reliable or competent eyewitnesses for random ESD events and vectors. Thus, ESD electrical overstress (EOS) failures, in particular, are notoriously mischaracterized and attributed to the wrong cause, and rarely provide reliable feedback to engineering and development on the robustness of the product in the field or on the requirements of the specific application environment. Here again, with lifetime product data collection through software accessible registers and interfaces, worldwide aggregate collection of ESD event statistics becomes possible, both for product reliability and for related scientific inquiries.
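By way of illustration only, the aggregate collection of ESD event statistics mentioned above might be sketched as follows. The record layout (`unit_id`, `region`, `node`, `amplitude_bin`) and the function name `aggregate_events` are hypothetical and are not taken from any existing product or standard; the sketch assumes that each fielded unit has already reported its logged events through some software accessible interface.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class EsdEventRecord:
    """One logged transient event. All fields are hypothetical."""
    unit_id: str        # assumed product serial number
    region: str         # assumed deployment region label
    node: str           # assumed name of the monitored pin or node
    amplitude_bin: int  # assumed quantized severity bin from the detector


def aggregate_events(records):
    """Count reported events per region and per monitored node."""
    by_region = Counter(r.region for r in records)
    by_node = Counter(r.node for r in records)
    return by_region, by_node


if __name__ == "__main__":
    sample = [
        EsdEventRecord("SN001", "EU", "USB_DP", 3),
        EsdEventRecord("SN002", "EU", "HDMI_CLK", 1),
        EsdEventRecord("SN003", "APAC", "USB_DP", 2),
    ]
    regions, nodes = aggregate_events(sample)
    print(regions.most_common())
    print(nodes.most_common())
```

Counts of this kind, grouped by deployment region or by affected node, are one simple form that field feedback to engineering and development could take.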
It would be desirable to have a device that detects and accurately characterizes transient pulses while the system is assembled in part or in whole, and that does not affect the normal operation or configuration of the system being analyzed. Furthermore, it would also be desirable to have a device that provides this detection and characterization function without the need for additional, costly, calibrated precision scanning and measurement equipment. Still further, it would be desirable to have a device for detection and characterization which can be accessed through existing, well known internal register space interfaces (such as peripheral component interconnect (PCI) configuration registers) and/or external test and debug interfaces, such as boundary scan.
Therefore, there currently exists a need in the industry for a device and associated method that can provide the advantages of transient scanning and residual current measurement equipment while the product is fully assembled and operating, without the present constraint of partial disassembly for access to PCBs and other components, and without adding appreciable product cost. Such a device would not only provide improved and enhanced analysis and design methodologies; because the detection and characterization device is integrated into the system, it would also inherently enable data collection outside the development lab, in the end user environment, for example, for reliability studies, warranty information, and field repair diagnostics.