1. Technical Field
This invention generally relates to computer systems, and more specifically relates to apparatus and methods for diagnosing run-time problems in computer systems.
2. Background Art
Since the dawn of the computer age, computer systems have evolved into extremely sophisticated devices, and computer systems may be found in many different settings. Computer systems typically include a combination of hardware, such as semiconductors and circuit boards, and software, also known as computer programs. As advances in semiconductor processing and computer architecture push the performance of the computer hardware higher, more sophisticated computer software has evolved to take advantage of the higher performance of the hardware, resulting in computer systems today that are much more powerful than just a few years ago.
As the sophistication and complexity of computer software increase, the software becomes more difficult to debug. Debugging is the process of finding problems, or “bugs”, during the development of a computer program. Most modern programming environments include a debugger that provides tools for testing and debugging a computer program. Known debuggers allow the user to set one or more breakpoints in the computer program, which are points where the execution of the computer program is stopped so that the state of the program can be examined to verify that the program executed as designed.
Another type of problem that can occur is a run-time problem that is not a “bug” per se, but is a problem that arises due to run-time conditions at the time the computer program is executed. One such type of run-time problem is a performance problem that arises due to excessive demand on computer system resources, such as performing an excessive number of I/O operations in a given period of time. Both bugs and run-time performance problems are collectively referred to herein as run-time errors. Most modern programming languages support defining an event known in the art as a software “exception”, which is associated with a portion of code that is run when a defined run-time error occurs. Different exceptions may be defined to represent different run-time errors. For example, a “disk I/O exception” could be defined that is thrown if a write to a disk is not successful. A “class not found” exception could be defined that is thrown when an attempt is made to load an object oriented class that is not present. Exceptions provide a way to execute a desired portion of code when a run-time error occurs.
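By way of illustration only, the two example exceptions described above might be sketched in Python as follows. The names DiskIOException, ClassNotFoundException, KNOWN_CLASSES, load_class, write_block, and run are hypothetical and are chosen for this sketch; the simulated disk and class registry are assumptions that stand in for real system resources.

```python
class DiskIOException(Exception):
    """Hypothetical exception: a write to a disk was not successful."""


class ClassNotFoundException(Exception):
    """Hypothetical exception: a requested class is not present."""


# Hypothetical registry standing in for the set of classes that can be loaded.
KNOWN_CLASSES = {"Invoice": dict}


def load_class(name):
    """Return the class registered under `name`, or throw ClassNotFoundException."""
    try:
        return KNOWN_CLASSES[name]
    except KeyError:
        raise ClassNotFoundException(f"class not found: {name}") from None


def write_block(disk, data):
    """Simulated disk write; throws DiskIOException when the disk is full."""
    if len(disk["blocks"]) >= disk["capacity"]:
        raise DiskIOException("write to disk was not successful")
    disk["blocks"].append(data)


def run(disk, class_name, data):
    """Attempt a class load and a disk write; return messages logged by handlers."""
    log = []
    try:
        load_class(class_name)
        write_block(disk, data)
    except ClassNotFoundException as exc:
        # Portion of code executed when the "class not found" error occurs.
        log.append(f"handled: {exc}")
    except DiskIOException as exc:
        # Portion of code executed when the disk write fails.
        log.append(f"handled: {exc}")
    return log
```

In this sketch, each except clause is the portion of code that is run when the corresponding run-time error occurs. A call such as run({"capacity": 0, "blocks": []}, "Invoice", b"x") would exercise the disk I/O handler, while a call naming a class absent from the registry would exercise the class-not-found handler.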
Programmers often use exceptions to debug their code and to find run-time errors. However, many complex computer systems in operation today routinely throw hundreds and even thousands of exceptions during normal operating conditions. When a real problem occurs, the number of exceptions can rise to even greater levels. A human programmer would have a hard time wading through thousands of logged exceptions to try to determine which occurred during normal processing and which occurred due to some unexpected problem. Without a mechanism and method for more specifically defining criteria for run-time errors, and automatically initiating diagnostic functions when the defined criteria are met, the computer industry will continue to suffer from inefficient methods and tools for locating the cause of run-time errors in a computer system.