1. Field of the Invention
This invention relates generally to an improvement in handling errors in a data processing system, and more specifically to detecting, diagnosing, and handling deadlock errors occurring in a data processing system.
2. Description of the Prior Art
Many data processing systems (e.g., computer systems, programmable electronic systems, telecommunication switching systems, control systems, and so forth) detect different types of errors. Some errors indicate a minor problem while other errors indicate a serious problem. Because data processing systems are being designed to offer higher percentages of “up-time,” it is critical to know how severe an error is and whether the system must be shut down to limit data corruption, or if the system can continue to operate without impact to the user.
These are some typical error levels of severity:
(1) An advisory error does not interrupt normal operations and is recorded only for informational purposes.
(2) A correctable error is an error that can be corrected by hardware or software and which is logged.
(3) An uncorrectable error is an error which may require some software help to keep the error contained and keep the system running.
(4) A fatal error is an error that can cause data corruption if the data processing system or subsystem is not halted immediately.
(5) A deadlock failure occurs when two or more processes are competing for the same resource, or when these processes cannot proceed to completion because the resource is unavailable.
There have been several ways to log and report errors in data processing systems. Most data processing chips provide an error logging and recovery strategy for likely errors. However, unforeseen errors (which might be design mistakes) could cause all chip processing to halt, preventing the usual error handling. Such errors are called deadlock errors, and result in the data processing system appearing to “freeze” until it is manually reset, or a watchdog device performs the reset.
Most data processing systems do not even attempt to handle deadlock error situations. Those systems that attempt to handle such errors typically set up some type of external watchdog device that detects when the data processing system has not reached a checkpoint or made progress for a period of time. Because this watchdog device is external, it cannot determine the cause of the deadlock error or which component is unavailable, and therefore can only reset the system and assume that the deadlock error will not happen again. The watchdog device also adds extra cost to system deployment.
Other more specific types of system reset have been tried in the past. Some bus protocols provide a special signal that causes a reset in all bus states, but this special signal ignores all pending transactions. The disadvantage of these prior art strategies is that they only work on one bus at a time (a chip connecting to multiple buses would need many different detection circuits) and are complex to implement. Since these strategies generally do not reset all chip states through the already existing reset circuitry, these special signals require a significant amount of extra logic, and thus are susceptible to many design errors themselves.
In typical prior art systems, no deadlock information is recorded in the error register to allow software or users to determine when or why multiple deadlock errors have occurred. Such deadlock error information would be desirable to allow software or users to determine if deadlock errors are occurring, what is causing the deadlock error, and if a system reset after a severe error was caused by a deadlock error. For example, a system reset could recur continuously if deadlock errors are not disabled and the cause of a deadlock error is not corrected.
It would be desirable to have the capability to enable or disable deadlock errors, record extensive information about deadlock errors, and be able to determine from the error log registers after a system reset that the system reset was caused by a deadlock error.
An object of the invention is to provide the capability to enable or disable deadlock errors, record extensive information about deadlock errors, and be able to determine from the error log registers after a system reset that the system reset was caused by a deadlock error.
A first aspect of the invention is directed to a method for indicating a deadlock error in a data processing system capable of having at least one deadlock error. The method includes indicating that an error is at least one deadlock error, providing an input signal to set a deadlock error enable circuit having an output signal indicating that the deadlock error will cause a deadlock reset signal to be asserted, logically ORing one or more signals from said at least one deadlock error with a first combinational logic circuit having a deadlock output, and logically ANDing the deadlock output of the first combinational logic circuit and the output signal of the deadlock error enable circuit with a second combinational logic circuit having an output to produce the deadlock reset signal.
A second aspect of the invention is directed to a data processing system or error log system, capable of having a deadlock error selected from a plurality of deadlock errors. The data processing system or error log system includes a deadlock error enable circuit receiving a plurality of input enable signals and having an output signal indicating that the deadlock error will cause a deadlock reset signal to be asserted, a first combinational logic circuit to logically OR the plurality of deadlock signals, having a deadlock output, and a second combinational logic circuit to logically AND the deadlock output of the first combinational logic circuit and the output signal of the deadlock error enable circuit, having an output to produce said deadlock reset signal.
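The OR and AND stages recited in the two aspects above can be illustrated with a minimal software sketch. This is not the claimed circuit: the function name `deadlock_reset`, its parameter names, and the modeling of the deadlock error enable circuit as a single boolean output are assumptions made only for illustration.

```python
from typing import Iterable


def deadlock_reset(deadlock_signals: Iterable[bool], enable_output: bool) -> bool:
    """Model the deadlock reset signal as (OR of deadlock signals) AND enable.

    deadlock_signals: hypothetical per-source deadlock error indications.
    enable_output: hypothetical output of the deadlock error enable circuit.
    """
    # First combinational stage: logical OR of all deadlock error signals.
    deadlock_output = any(deadlock_signals)
    # Second combinational stage: logical AND with the enable circuit output.
    return bool(deadlock_output and enable_output)


# One deadlock source is active and deadlock errors are enabled: reset asserted.
print(deadlock_reset([False, True, False], enable_output=True))
# The same deadlock with deadlock errors disabled: reset not asserted.
print(deadlock_reset([False, True, False], enable_output=False))
```

In this sketch, disabling deadlock errors masks the reset without hiding the individual deadlock indications, which is consistent with the stated goal of recording deadlock information even when the reset is suppressed.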
These and other objects and advantages of the invention will become apparent to those skilled in the art from the following detailed description of the invention and the accompanying drawings.