The invention relates generally to the field of error management in a computer program.
The term “booting” refers to a process of loading and executing programs, principally those making up an operating system, in order to prepare a computer system for use by a user. The booting process is said to be made up of a number of operations. As computers become more powerful and capacious and as operating systems continue to evolve, the number of operations invoked during the booting process increases. Additional operations whose invocation succeeds typically add useful functionality to the operating system by, for example, automatically identifying, configuring, and initializing hardware devices installed in the computer system. On the other hand, a significant number of these operations are susceptible to failure when invoked on computer systems having particular hardware or software configurations. When these operations fail during the booting process, they often prevent the booting process from completing, thus making the computer system unusable by the user.
Conventionally, when one of these operations fails during the booting process, a knowledgeable user may be able to diagnose and resolve the problem by identifying, via trial and error, the operation that is failing, then manually modifying the sequence of operations invoked during the booting process to exclude the identified operation. Even where the user is capable of diagnosing and resolving the problem alone, the process is arduous and time-consuming. On the other hand, when the user is incapable of diagnosing and resolving the problem alone, the computer system generally remains unusable until the user can obtain assistance.
Accordingly, an automated system for identifying operations that fail during the booting process and reversibly preventing their invocation during future iterations of the booting process would have significant utility.
The present invention is directed to tracking and managing operations, such as device initialization operations, that may fail on some computer systems. In accordance with a preferred embodiment of the invention, an operation invocation management software facility (“the facility”) prevents the invocation of operations that have been determined to fail on the current computer system. When operations that have neither been determined to fail on the current computer system nor been determined not to fail on the current computer system are attempted, they are placed on a stack of outstanding attempted operations. When such operations complete successfully, the facility removes them from the stack and marks them as having been determined not to fail on the current computer system. When the execution of the operating system concludes, e.g., when the operating system crashes, if one or more operations are still on the stack, these operations have not completed successfully. In such cases, the operation on the top of the stack, i.e., the most recently begun operation, is marked as having been determined to fail on the current computer system.
During each iteration of the booting process, if any operations remain on the stack from the previous iteration, the operation on the top of the stack is marked as having been determined to fail on the current computer system. The stack is then cleared and used to monitor the operations attempted during the current boot process.
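The stack-based tracking described above can be illustrated with a minimal sketch. It is not the claimed implementation: the names `Facility`, `Status`, `start_boot`, `invoke`, and `SimulatedCrash` are hypothetical, the stack is held in memory rather than in storage that survives a crash, and a crash is simulated by an exception that escapes before the operation is removed from the stack.

```python
from enum import Enum


class Status(Enum):
    UNKNOWN = "unknown"   # neither determined to fail nor determined not to fail
    OK = "ok"             # determined not to fail on this computer system
    FAILS = "fails"       # determined to fail on this computer system


class SimulatedCrash(Exception):
    """Stands in for an operating system crash (assumption for illustration)."""


class Facility:
    def __init__(self):
        self.status = {}  # operation identifier -> Status
        self.stack = []   # outstanding attempted operations (persistent in practice)

    def start_boot(self):
        # An operation left on the stack never completed; blame the most recent one.
        if self.stack:
            self.status[self.stack[-1]] = Status.FAILS
        self.stack.clear()

    def invoke(self, op_id, operation):
        state = self.status.get(op_id, Status.UNKNOWN)
        if state is Status.FAILS:
            return False              # invocation is prevented
        if state is Status.OK:
            operation()               # known-good operations are not tracked
            return True
        self.stack.append(op_id)      # record the attempt before running it
        operation()                   # a crash here leaves op_id on the stack
        self.stack.pop()              # completed successfully: remove from stack
        self.status[op_id] = Status.OK
        return True


def crashing_operation():
    raise SimulatedCrash()


facility = Facility()
facility.start_boot()
facility.invoke("init-network", lambda: None)
try:
    facility.invoke("init-video", crashing_operation)
except SimulatedCrash:
    pass                              # the "system" went down mid-operation

facility.start_boot()                 # next boot: the leftover stack top is marked
skipped = not facility.invoke("init-video", crashing_operation)
```

In this sketch `invoke` deliberately has no cleanup handler around the operation call, so an interrupted operation stays on the stack, which is the property the next boot relies on.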
The facility preferably includes a user interface for displaying to the user the set of operations that have failed since the last successful boot. This user interface may preferably be disabled for unsophisticated users. Also, in one embodiment, for certain failed operations, special code specific to the operations can be executed to either (1) describe the failure or (2) remediate the failure. A second user interface preferably allows the user to review a log of operation failures noted by the invention.
The facility preferably uses universally unique identifiers (UUIDs), such as globally unique identifiers (GUIDs), to uniquely identify different operations, so that software developers who do not communicate can add new operations to the set of operations without creating an operation identifier conflict. A UUID contains both a time identifier and a machine identifier. The time identifier ensures that two UUIDs produced on the same machine are unique because they are produced at different times. The machine identifier, on the other hand, ensures that two UUIDs produced at the same time are unique because they were produced on different machines. Operation attempts, or “operation instances,” are further identified by additional instance data that distinguish separate attempts of the same operation.
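As a rough illustration of this identification scheme, the sketch below uses Python's standard `uuid.uuid1()`, which builds a version 1 UUID from a timestamp and a node (machine) identifier. The `OperationInstance` class, its field names, and the device strings are hypothetical and chosen only for illustration; the form of the instance data is an assumption.

```python
import uuid
from dataclasses import dataclass


@dataclass(frozen=True)
class OperationInstance:
    operation_id: uuid.UUID   # identifies the operation itself
    instance_data: str        # distinguishes separate attempts of that operation


# Two identifiers generated on the same machine differ in their time component.
op_a = uuid.uuid1()
op_b = uuid.uuid1()

# Identifiers generated with different node values differ in their machine component.
op_machine_1 = uuid.uuid1(node=0x000000000001)
op_machine_2 = uuid.uuid1(node=0x000000000002)

# The same operation attempted against two devices yields two distinct instances.
first_attempt = OperationInstance(op_a, "device=0")
second_attempt = OperationInstance(op_a, "device=1")
```

Here `op_a.time` and `op_a.node` expose the time and machine components that the paragraph above describes.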
The facility thus is able to identify the operations that typically fail on the current computer system and to protect the computer system from possible future failure by preventing the invocation of such operations.