Deterministic application record and replay is the ability to record application execution and deterministically replay it at a later time. This record-replay capability can provide many potential benefits related to systems development, testing, and maintenance.
For instance, the record-replay capability can be used to diagnose and debug applications by capturing and reproducing hard-to-find bugs. It can also be used for intrusion analysis by capturing intrusions that involve non-deterministic effects, and for fault tolerance by providing replicas that replay execution and, when a fault occurs, go live in place of the previously running application instance.
Given these potential benefits, various approaches have been devised for providing record-replay functionality. These previous approaches, however, have certain limitations, including the inability to provide low-overhead record-replay for unmodified applications on commercially available, unmodified multiprocessor systems and operating systems.
For example, hardware approaches require hardware modifications and thus do not work on commercially available, unmodified hardware. Virtual machine approaches provide transparent record-replay but incur high overhead on multiprocessor systems. Operating system approaches require extensive kernel modifications and are unable to capture many non-deterministic events during recording of application execution. Programming language mechanisms do not support applications written in languages that lack record-replay primitives. Finally, application-level and library-level approaches provide record-replay for only some applications and require application and library modifications.