This Background is intended to provide the basic context of this patent application.
Operating systems (OS) are a key building block in the development of computing systems. In the several decades since personal computing became widespread, operating systems have substantially increased in complexity. The ability to multi-task and support concurrent processes gives even modest personal computers the appearance of simultaneously running a wide variety of programs, from word processors to Internet browsers.
In fact, though, virtually all microprocessor-based systems run one program at a time, using a scheduler to guarantee that each running program is given processor time and system memory in sufficient quantities to keep running. This task can become quite complex. Each process running on a computer can spawn individual tasks called threads. Some threads can spawn subordinate threads. It is common to have dozens, or even hundreds, of threads active at a given time, executing any number of programs or processes running in both user mode and kernel mode. To complicate matters, the computer may have limited resources, such as disk storage, network input/output, and program memory. The OS executing on the computer coordinates all scheduling and management of the various user and kernel mode processes. However, if the underlying hardware (e.g., the processor or system memory) is faulty, even perfect OS operation will not prevent a system failure.
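By way of illustration only, the time slicing described above may be sketched as a simple round-robin simulation. The function name `round_robin`, the task names, and the `quantum` parameter are hypothetical and are not drawn from any particular operating system; an actual scheduler would also account for priorities, blocked threads, and multiple processors.

```python
from collections import deque


def round_robin(tasks, quantum):
    """Simulate time slicing on a single processor (illustrative assumption).

    tasks: dict mapping a task name to the time units it still needs.
    quantum: maximum time units a task may run before it is preempted.
    Returns the (task name, time units run) slices in execution order.
    """
    if quantum <= 0:
        raise ValueError("quantum must be positive")

    ready = deque(tasks.items())  # queue of tasks waiting for the processor
    timeline = []
    while ready:
        name, remaining = ready.popleft()
        ran = min(quantum, remaining)  # only one task runs at any moment
        timeline.append((name, ran))
        if remaining > ran:
            # Unfinished work goes to the back of the queue.
            ready.append((name, remaining - ran))
    return timeline


# Hypothetical workload: three tasks sharing one processor.
timeline = round_robin({"word_processor": 3, "browser": 5, "driver": 2}, quantum=2)
for name, ran in timeline:
    print(f"{name} ran for {ran} unit(s)")
```

Because each task receives short slices in turn, all three appear to make progress at once even though only one executes at any instant.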
User mode processes (e.g., word processing and other user applications) are heavily monitored by the OS and do not interact directly with the computer hardware. Because of their limited access to the computer's underlying hardware, user mode processes have a limited ability to cause a system freeze or crash in the event of an exception. Kernel mode processes (e.g., device drivers and other component interfaces) execute outside of user control, directly access hardware, and may easily cause system failure if an exception occurs. The OS relies on the underlying hardware to function correctly in order for user and kernel mode processes to run without error. Typical approaches to diagnosing and resolving exceptions include live debugging of the application process and capturing detailed information about the processes involved from the computer memory at failure time for post-mortem analysis at a server.
Corrupt or malfunctioning hardware components may produce errors that are difficult to identify quickly. For example, the process data captured at a client system at the time of a crash and forwarded to an error analysis server may not contain enough information to diagnose hardware failures. Further, server or “backend” analysis incurs additional delays from collecting data at the client, sending the data to the analysis server, conducting the analysis, and returning data or instructions to the client.