Field of the Invention
The present invention relates in general to the field of computers and similar technologies, and in particular to software utilized in this field. Still more particularly, it relates to a method, system and computer-usable medium for detecting the cause of a system hang in a verification environment.
Description of the Related Art
Today's computing environments continue to grow in scale and complexity, placing ever-greater demands upon system performance, reliability and availability. These demands often result from the constantly increasing amount of data sharing and volumes of transaction processing inherent in large system applications. Another aspect of these demands is the unpredictability of their workloads, which mandates that these systems not only be highly scalable, but also support concurrent processes that may unexpectedly conflict with one another and cause the system to hang. As a result, it is common to conduct hardware testing in a verification environment to detect potential causes of system hangs.
However, certain classes of system hangs are difficult to expose using traditional random irritation techniques. For example, a deadlock is a situation in which two or more processes in a data processing system are unable to proceed because each is waiting for one of the others to do something. A common example is a program communicating with a server, where the program is waiting for output from the server before sending it any additional data. Meanwhile, the server is similarly waiting for more input from the controlling program before it is able to produce an output.
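The circular wait in the program and server example can be illustrated with a small sketch. It is only an illustration under stated assumptions, not a method taken from this description: the function name `find_wait_cycle` and the `waits_for` mapping are hypothetical, and each process is assumed to wait on at most one other process.

```python
def find_wait_cycle(waits_for):
    """Return the processes forming a circular wait, or None if there is none.

    `waits_for` is a hypothetical mapping from each blocked process to the
    single process it is waiting on (an assumption made for this sketch).
    """
    for start in waits_for:
        path = [start]
        seen = {start}
        current = waits_for.get(start)
        # Follow the chain of waits until it ends, loops back to the
        # starting process, or runs into a loop that excludes it.
        while current is not None:
            if current == start:
                return path  # every process on the path waits on the next
            if current in seen:
                break  # a loop exists, but `start` is not part of it
            seen.add(current)
            path.append(current)
            current = waits_for.get(current)
    return None


# The program waits for server output; the server waits for program input.
deadlocked = find_wait_cycle({"program": "server", "server": "program"})

# Only the program is waiting, so the server can still make progress.
not_deadlocked = find_wait_cycle({"program": "server"})
```

In the first call both parties appear in the returned cycle, which corresponds to the deadlock described above. In the second call the chain of waits ends at the server, so no cycle is reported.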
Another example of a system hang is a livelock, which is similar to a deadlock, except that the states of the processes involved in the livelock constantly change with regard to one another. For example, two or more processing elements may be stuck in loops because each processing element repeatedly reaches a point in the loop where it must request the other to retry a particular command. A livelock can occur, for example, when a process that calls another process is itself called by that process. Such livelocks may be caused by a software or hardware design issue.
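The retry loop described above can be sketched as a toy simulation. This is a simplified model under stated assumptions rather than a description of any particular hardware: the function `run_rounds`, the element names, and the rule that an element must retry whenever any other element has a command in flight are all hypothetical.

```python
def run_rounds(active, max_rounds):
    """Simulate processing elements that ask one another to retry.

    `active` is the set of element names that each have one command to
    complete. Hypothetical rule for this sketch: in a given round, an
    element must retry if any other element still has a command in flight.
    """
    done = set()
    retries = {name: 0 for name in active}
    for _ in range(max_rounds):
        in_flight = active - done
        if not in_flight:
            break  # every command has completed
        newly_done = set()
        for element in sorted(in_flight):
            others = in_flight - {element}
            if others:
                retries[element] += 1  # state changes, but no progress is made
            else:
                newly_done.add(element)  # no conflict, so the command completes
        done |= newly_done
    return done, retries


# Two elements collide on every round: retry counts grow, nothing completes.
livelock_done, livelock_retries = run_rounds({"A", "B"}, max_rounds=5)

# A single element has no one to conflict with and completes immediately.
solo_done, solo_retries = run_rounds({"A"}, max_rounds=5)
```

Unlike the deadlock case, the state of each element keeps changing here, since the retry counters increase every round, yet neither command ever completes within the allotted rounds.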