Despite developers' best efforts to release high-quality code, deployed software inevitably contains bugs. When a failure occurs in the field, many programs record their state at the point of failure, e.g., in the form of a core dump, stack trace, or error log. That snapshot is then sent back to the developers for analysis. Perhaps the best-known example is the Windows Error Reporting™ framework, which has collected over a billion error reports from user programs and the kernel.
Unfortunately, isolated snapshots tell only part of the story. The root cause of a bug is often difficult to determine based solely on the program's state after a problem was detected; accurate diagnosis often hinges on an understanding of the events that preceded the failure. For this reason, some systems implement program replay. These frameworks log enough information about a program's execution to replay it later under the watchful eye of a debugging tool. However, these systems require a specially instrumented execution environment, such as a custom kernel, to capture a program's execution, which makes them unsuitable for field deployment on unmodified end-user machines.