1. Field of the Invention
The present invention is directed to a system for copying a program state while the program continues to run and, more particularly, to a system that duplicates a page in memory that is about to be modified so that the page can be saved in its unmodified condition.
2. Description of the Related Art
For some computer programs, a failure before the program produces a solution to the problem it is working on can cost the user a great deal. This is typically the case when the program needs to run for several hours or days before a solution is reached. The cost comes from lost time, because the job must be run again in its entirety, and from the expense of electricity and of buying computing time on a machine. One solution to this problem is a technique called Check Point Restart (CPR). Periodically, the user pauses the execution of the program and a copy of all of its state is made before execution resumes. If the program or computer crashes after the copy is made, execution can be restarted from the point at which the copy was made instead of from the beginning.
For some programs, the process of making a copy of their state can be prohibitively expensive. These programs typically use a large amount of memory, and all of that memory must be copied to nonvolatile storage before the program can resume. Because the memory is large and disks are usually slow, making the copy can take a long time, and the program must be stopped for an appreciable amount of time.
One practice for CPR is for the computing program to be paused while a utility program copies all of its data onto the storage device. Because the computing program does not run during the copy, there is no risk that it will modify its data before the utility program has had a chance to copy it.
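As a rough illustration only, the pause-and-copy practice and the later restart can be sketched as follows. The function names, the dictionary used as program state, the `fail_at` parameter that simulates a crash, and the choice of `pickle` as the storage format are all assumptions made for this sketch and are not taken from any particular CPR system.

```python
import os
import pickle


def save_checkpoint(state, path):
    """Copy the whole state to storage; the computation waits while this runs."""
    tmp = path + ".tmp"
    with open(tmp, "wb") as f:
        pickle.dump(state, f)
        f.flush()
        os.fsync(f.fileno())
    # Rename only after the data is written, so a crash during the copy
    # leaves the previous checkpoint intact.
    os.replace(tmp, path)


def load_checkpoint(path):
    """Return the saved state, or None if no checkpoint exists yet."""
    if not os.path.exists(path):
        return None
    with open(path, "rb") as f:
        return pickle.load(f)


def run(path, total_steps, checkpoint_every, fail_at=None):
    """Hypothetical long-running job: sums the step numbers 0..total_steps-1."""
    # Restart from the last checkpoint if there is one.
    state = load_checkpoint(path) or {"step": 0, "total": 0}
    while state["step"] < total_steps:
        if fail_at is not None and state["step"] == fail_at:
            raise RuntimeError("simulated crash")
        state["total"] += state["step"]
        state["step"] += 1
        if state["step"] % checkpoint_every == 0:
            # The computation is paused here until the copy completes.
            save_checkpoint(state, path)
    return state
```

In this sketch, a run that crashes after a checkpoint can be resumed by calling `run` again with the same path; the work done since the last checkpoint is repeated, but not the work before it. The time spent inside `save_checkpoint` grows with the size of the state, which is the cost the preceding paragraphs describe.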
Another practice is for the computing program to call an operating system routine such as UNIX's fork() or an equivalent function. This routine makes an exact duplicate of the program, including the process state, file descriptors, program memory space and program data space. The state of the duplicate is then copied to the storage device while the original continues execution. As part of the fork function, the operating system avoids making a full copy of the memory data by using a practice known as Copy On Write (COW). The duplicate process's program data space is shared with the original process's data space until either process modifies a memory page, at which time that page alone is duplicated. The fork operation is nevertheless slow, because the operating system must make a full copy of the virtual-to-physical memory mapping data, known as page tables, regardless of how many pages are later copied on write.
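A minimal sketch of the fork-based practice, assuming a UNIX-like system where Python's `os.fork` is available, might look like the following. The function names and the dictionary used as program state are hypothetical; the sketch shows only that the duplicate process saves the state as it was at the moment of the fork while the original goes on modifying it.

```python
import os
import pickle
import tempfile


def checkpoint_with_fork(state, path):
    """Fork a duplicate process that writes `state` to `path`; return its pid."""
    pid = os.fork()
    if pid == 0:
        # Duplicate process: its memory reflects the original at fork time.
        # Pages are shared until one side writes to them (Copy On Write).
        status = 1
        try:
            with open(path, "wb") as f:
                pickle.dump(state, f)
            status = 0
        finally:
            # Exit without running the original's cleanup handlers.
            os._exit(status)
    return pid


def demo():
    """Hypothetical driver: checkpoint, keep computing, then inspect the copy."""
    state = {"iteration": 0, "data": [0] * 1000}
    fd, path = tempfile.mkstemp()
    os.close(fd)

    pid = checkpoint_with_fork(state, path)

    # The original continues and modifies its data without waiting.
    state["iteration"] = 1
    state["data"][0] = 99

    # Wait for the duplicate only so that the saved file can be read back.
    _, wait_status = os.waitpid(pid, 0)
    with open(path, "rb") as f:
        saved = pickle.load(f)
    os.remove(path)
    return saved, state, os.WEXITSTATUS(wait_status)
```

The original process is delayed only for the duration of the `os.fork` call itself, which is where the page table copying described above takes place.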
What is needed is a system that improves upon these situations.