This invention relates generally to error detection in electronic circuits and, more particularly, to detection of signal errors in digital systems that might occur in a transfer medium between a source device such as a plug-in card and a destination device such as another plug-in card.
Most digital systems require the transfer of signals between integrated circuit devices (ICs). Sometimes these devices reside on a single printed wiring board (PWB). Sometimes the source is on one PWB and the destination device is on another PWB. In the latter case the signals usually travel from one board to the other through one or more electromechanical connectors made from pins and mating receptacles.
Digital system operations rely on error-free signal transfers. Real world environments, however, can introduce errors in the path between devices. For example, the source device's output driver or the destination device's input receiver circuit can fail for any of a variety of reasons associated with the normal life expectancy of an IC or from environmental stress such as static discharge or power surges. More likely, the electrical connection between an IC and a PWB can fail through mechanical stress or oxidation of a cold solder joint. In the case of socketed devices or signal transfers through PWB connectors, pins can become bent, broken, or otherwise damaged, degrading the electrical connection and causing signal transfer errors.
It is often desirable to detect a signal transfer error as soon as possible so that exception handling procedures can be initiated. At a minimum, these procedures should alert the user that an error has occurred. In more critical applications, these procedures should prevent known-corrupted data from propagating to the rest of the digital system and rendering it useless, thereby allowing the system to continue operating soundly at a reduced performance level. For example, digital systems containing large amounts of physical memory often realize that memory through the repeated connection of many smaller memory elements (each one being either one IC or a PWB containing several ICs). The successful detection of signal transfer errors within the memory subsystem could trigger a response to block out, or stop using, the affected portion of memory, allowing operation of the rest of the subsystem to continue until an operator can replace the affected portion. Another example is a computer network switch, a system designed to intelligently route data between several (often hundreds of) network connections, or ports. Each port, as part of its routing function, uses resources shared by all the other ports, such as memory for temporary data storage. Management of the shared resources is often a centralized function, thereby necessitating the transfer of control signals between each switch port and a central control unit. Transfer errors between any one port and the central control unit have the potential of affecting all ports and causing data corruption throughout the switch. By promptly detecting and locating such errors, offending ports can be disabled before they affect the rest of the network traffic through the switch.
Prior techniques for detecting signal transfer errors in real time include parity and more complex Hamming code data added to digital signals. However, these techniques increase the "signal count" of the system (i.e., the number of signal paths required to carry the data signal and the error-detecting data) and thereby introduce additional points of potential failure. Power-on diagnostics or diagnostics done during an off-line period are also known, but these diagnostics do not detect signal transfer errors as they are occurring, allowing corrupted data to go undetected for some time.
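By way of illustration only, the signal count penalty of the parity technique can be sketched as follows. The function names and the 8-bit data width are assumptions chosen for the example and do not come from any particular prior system; the sketch uses even parity.

```python
from typing import List

Bits = List[int]


def add_even_parity(data_bits: Bits) -> Bits:
    """Append one parity bit so the total number of 1s is even.

    The returned word needs one more signal path than the data alone.
    """
    parity_bit = sum(data_bits) % 2
    return data_bits + [parity_bit]


def parity_check_passes(word: Bits) -> bool:
    """Destination-side check: an even count of 1s means no error was seen."""
    return sum(word) % 2 == 0


# Illustrative 8-bit data signal (an assumed width).
data = [1, 0, 1, 1, 0, 0, 1, 0]
word = add_even_parity(data)

# Eight data paths become nine paths once the parity bit is added.
extra_paths = len(word) - len(data)

# A single corrupted bit on any path is caught by the parity check.
corrupted = list(word)
corrupted[3] ^= 1
single_error_detected = not parity_check_passes(corrupted)
```

The added path carries only error-detecting data, yet it passes through the same drivers, solder joints, and connector pins as the data paths, so it is itself a potential point of failure.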
An objective of the invention, therefore, is to provide an improved method for detecting signal transfer errors in near real time in a digital system.
The invention provides a method and apparatus for detecting signal transfer errors in a digital logic system that might occur in a transfer medium between a source device and a destination device. The method includes sending a first diagnostic signal of one or more bits from the source device through the transfer medium to the destination device; comparing the first diagnostic signal received by the destination device with a second diagnostic signal within the destination device to determine if a signal transfer error has occurred; inverting the first diagnostic signal; sending the inverted first diagnostic signal from the source device through the transfer medium to the destination device; and comparing the inverted first diagnostic signal received by the destination device with the second diagnostic signal to determine if a signal transfer error has occurred. Two embodiments of the invention are disclosed. To provide near real time detection without adding signal paths, the diagnostic signals are sent along established signal paths during a diagnostic clock cycle that is added to the normal clock cycles of the digital system.
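The steps of the method may be easier to follow in a small software model. The sketch below is illustrative only and is not either of the disclosed embodiments. It assumes a hypothetical transfer medium in which a faulty path is stuck at a fixed logic level, and it assumes that the destination re-inverts the received inverted signal before comparing it with the second diagnostic signal; the summary above does not specify how that comparison is arranged. All names are invented for the example.

```python
from typing import Callable, Dict, List, Tuple

Bits = List[int]
Medium = Callable[[Bits], Bits]


def make_medium(stuck_at: Dict[int, int]) -> Medium:
    """Model a transfer medium (hypothetical fault model).

    stuck_at maps a path index to the logic level that path is stuck at;
    paths not listed pass their bit through unchanged.
    """
    def transfer(bits: Bits) -> Bits:
        return [stuck_at.get(i, b) for i, b in enumerate(bits)]
    return transfer


def invert(bits: Bits) -> Bits:
    """Invert every bit of a signal."""
    return [1 - b for b in bits]


def diagnostic_cycle(first: Bits, second: Bits,
                     transfer: Medium) -> Tuple[bool, bool]:
    """Run both comparisons; return (error_on_first, error_on_inverted)."""
    # Send the first diagnostic signal and compare it at the destination.
    received = transfer(first)
    error_on_first = received != second

    # Invert at the source, send again, then compare. Assumption: the
    # destination re-inverts what it received before the comparison.
    received_inverted = transfer(invert(first))
    error_on_inverted = invert(received_inverted) != second

    return error_on_first, error_on_inverted


pattern = [1, 0, 1, 0]                   # first diagnostic signal
expected = [1, 0, 1, 0]                  # second signal held at destination

healthy = make_medium({})
faulty = make_medium({1: 0})             # path 1 stuck at logic 0

healthy_result = diagnostic_cycle(pattern, expected, healthy)
faulty_result = diagnostic_cycle(pattern, expected, faulty)
```

In this model the fault on path 1 is invisible to the first comparison, because the pattern already carries a 0 on that path, and is caught only by the comparison of the inverted signal. This suggests one reason for sending both polarities: each path is exercised at both logic levels.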
Other features of the invention will become apparent from the following description of two illustrative embodiments and the accompanying drawings.