Computer programs fail or crash from time to time despite extensive preventative measures. This is not a recent development; nearly every computer user has likely experienced a program failure at an inopportune time. Programs can fail for many reasons. For example, a program may fail prematurely if it violates memory access conditions, if it contains bad pointers, or if referenced data stores or other computer devices are unavailable. Furthermore, programs can simply crash due to logic errors introduced by a programmer during coding. Upon failure, conventional programs inform the user of the failure and terminate the application. In some cases, the memory contents prior to the failure are provided to the user to help determine the cause of the failure. Additionally, the user may be given the opportunity to provide failure information to the application vendor, which can use it to generate a patch or revised code that prevents such a failure in the future. In any event, the program is terminated and must be restarted from the beginning.
Restarting a program can involve re-executing a significant amount of code. Consider a workflow application, for example. A workflow application implements a business process as a series of steps that are executed sequentially. These steps can involve a plurality of tasks, some of which can take a long time to complete. If the application were to fail for some reason, execution would conventionally need to be restarted from the beginning, namely from the first instruction or task. This is costly in terms of both execution time and efficiency. For example, a step in a workflow application could involve reading a large amount of data from a data center, perhaps to build a data warehouse. This step alone could take hours to complete. Subsequently, a much less time-consuming task could fail for some trivial reason. At this point, the workflow application would need to be restarted from the beginning, which would again necessitate reading the data from the data center, among other things.
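By way of illustration only, the restart behavior described above can be sketched as follows. The step names (load_data, send_report), the helper functions, and the use of RuntimeError to signal a failure are all hypothetical and chosen for this sketch; they are not taken from any particular workflow system. The sketch shows that when a late, inexpensive step fails once, the expensive first step is executed again on the restart.

```python
from typing import Callable, List, Tuple

Step = Tuple[str, Callable[[], None]]


def run_workflow(steps: List[Step], log: List[str]) -> None:
    """Execute every step in order, recording each step that is attempted."""
    for name, action in steps:
        log.append(name)
        action()


def run_with_restart(steps: List[Step], max_attempts: int) -> List[str]:
    """Conventional recovery: on any failure, start over from the first step."""
    log: List[str] = []
    for _ in range(max_attempts):
        try:
            run_workflow(steps, log)
            return log
        except RuntimeError:
            continue  # no saved progress, so the next attempt begins at step one
    raise RuntimeError("workflow did not complete")


def make_flaky_step(failures_before_success: int) -> Callable[[], None]:
    """Hypothetical step that fails a fixed number of times, then succeeds."""
    remaining = [failures_before_success]

    def step() -> None:
        if remaining[0] > 0:
            remaining[0] -= 1
            raise RuntimeError("trivial failure")

    return step


def load_data() -> None:
    """Stands in for the lengthy data-loading step; does nothing here."""


steps: List[Step] = [
    ("load_data", load_data),
    ("send_report", make_flaky_step(1)),
]
log = run_with_restart(steps, max_attempts=2)
print(log.count("load_data"))  # prints 2: the lengthy step ran twice
```

In this sketch a single trivial failure in the second step doubles the number of times the lengthy first step is performed, which is the cost the paragraph above describes.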
It should be appreciated that this problem is not exclusive to workflow applications. Workflow applications, like any other applications, can be programmed in any one of a myriad of programming languages. Any conventional program, regardless of its problem space, has the potential to fail despite its developers' preventative efforts. Moreover, once a program has failed, it must start again from the beginning if it is to be re-executed.
Many recovery techniques have been developed and employed to mitigate the effects of failure. For example, data can be saved or backed up frequently, either manually or automatically. Conventional word processing programs employ this technique so that, upon failure, a user does not lose all of their data and can resume work on a previously saved document. However, the word processing program still terminates on failure and, if restarted, must begin execution again from the first instruction. Similarly, some computer systems provide a mechanism for restoring the system upon failure by storing copies of critical system files. After a failure, the current files can be replaced with the previously saved backup files. This is often employed to facilitate recovery from a complete system failure due to accidental or malicious file deletion or corruption.
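As a minimal sketch of the file-based restoration just described, and not a description of any particular system's mechanism, the following copies a set of files to a backup directory and later replaces the current files with the saved copies. The function names back_up and restore, the file name system.cfg, and its contents are hypothetical. As a simplification, the sketch assumes that every protected file has a distinct file name.

```python
import shutil
import tempfile
from pathlib import Path
from typing import List


def back_up(files: List[Path], backup_dir: Path) -> None:
    """Store a copy of each file in backup_dir (assumes distinct file names)."""
    backup_dir.mkdir(parents=True, exist_ok=True)
    for path in files:
        shutil.copy2(path, backup_dir / path.name)


def restore(files: List[Path], backup_dir: Path) -> None:
    """Replace each current file with its previously saved copy, if one exists."""
    for path in files:
        saved = backup_dir / path.name
        if saved.exists():
            shutil.copy2(saved, path)


with tempfile.TemporaryDirectory() as tmp:
    config = Path(tmp) / "system.cfg"      # hypothetical critical file
    config.write_text("setting=1")
    backups = Path(tmp) / "backups"

    back_up([config], backups)
    config.write_text("corrupted")         # simulate accidental corruption
    restore([config], backups)

    recovered = config.read_text()
    print(recovered)  # prints "setting=1"
```

Note that restoration of this kind recovers data only. A program that was running at the time of the failure must still be restarted from its first instruction.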