Various approaches have been proposed for handling errors or failures in computers. One example is U.S. Pat. No. 6,170,067, System for Automatically Reporting a System Failure in a Server (Liu et al., Jan. 2, 2001); it involves monitoring functions such as cooling fan speed, processor operating temperature, and power supply. However, this example does not address software errors. Another example is U.S. Pat. No. 5,423,025 (Goldman et al., Jun. 6, 1995); it involves an error-handling mechanism for a controller in a large-scale computer using the IBM ESA/390 architecture. In the above-mentioned examples, error handling is not flexible, error handling is not separated from hardware, and there is no dynamic tuning.
Generally, if a software product has any ability to handle errors, that ability is limited and inflexible. Conventional software fixes can be time-consuming to develop and difficult to apply. Conventional software error messages are often neither unique nor informative. Thus there is a need for flexible solutions that lead to a useful response; at the same time, the burden of reprogramming needs to be reduced, and the destabilizing effects of major code revisions need to be avoided.