Certain terms used in the “Background of the Invention” are defined in the “Definitions” section below.
1. Computer Applications
Much of our daily lives is augmented by computers. The many services upon which we depend, such as banking, communications, air and rail travel, online shopping, credit-card and debit-card purchases, mail and package delivery, and electric-power distribution, are all managed by computer applications.
In its simplest form, as shown in FIG. 1, a typical computer application is generally implemented as a computer program (1) running in a computer (2). A computer program is basically a set of computer-encoded instructions. It often is called an executable because it can be executed by a computer. A computer program running in a computer is called a process, and each process has a unique identification known to the computer. Many copies of the same computer program can be running in a computer as separately distinguishable processes.
An application typically includes multiple interacting processes.
2. Application Database
With reference to FIG. 1, an application often depends upon a database (3) of information that the application maintains to record its current state. Often, the information in the database is fundamental to the operation of the application, to the decisions it makes, and to its delivery of services to the end users.
The database may be stored in persistent storage such as a disk for durability, it may be stored in high-speed memory for performance, or it may use a combination of these storage techniques. The database may be resident in the same computer as the application program, it may be resident in another computer, it may be implemented as an independent system, or it may be distributed among many systems.
A database generally includes one or more files or tables, though it may be just a random collection of unorganized data. Each file or table typically represents an entity set such as “employees” or “credit cards.” A file comprises records, each depicting an entity-set member such as an employee. A table comprises rows that define members of an entity set. A record comprises fields that describe entity-set attributes, such as salary. A row comprises columns that depict attributes of the entity set. In this specification, “files” are equivalent to “tables;” “records” are equivalent to “rows;” and “fields” are equivalent to “columns.”
3. Requests
With further reference to FIG. 1, incoming end users (4) generate requests (5) to be processed by the computer application. End users may be people, other computer applications, other computer systems, or electronic devices such as electric power meters. In this specification, the term “end user” means any entity that can influence an application and/or can request or use the services that it provides.
An example of an incoming request from an end user is a request for a bank-account balance.
Another example is an alert that a circuit breaker in a power substation has just tripped. In some cases, there may be no incoming request. For instance, a computer application may on its own generate random events for testing other applications.
4. Request Processing
As shown in FIG. 1, the application receives a request from an incoming end user (5). As part of the processing of this request, the application may make certain modifications to its database (6).
The application can read the contents of its database (7). As part of the application's processing, it may read certain information from its database to make decisions. Based on the request received from its incoming end user and the data in its database, the application delivers certain services (8) to its outgoing end users (9).
5. Services
A service may be delivered by an application process as the result of a specific input from an end user, such as providing an account balance in response to an online banking query. Another example of a service is the generation of a report upon a request from an end user.
Alternatively, the application program may spontaneously deliver a service, either on a timed basis or when certain conditions occur. For instance, a report may be generated periodically. Alternatively, an alarm may be generated to operations staff if the load being carried by an electric-power transmission line exceeds a specified threshold.
The end users providing the input to the application may or may not be the same end users as those that receive its services.
6. Availability
The availability of a computer system and the services it provides is often of paramount importance. For instance, a computer system that routes payment-card transactions for authorization to the banks that issued the payment cards must always be operational. Should the computer system fail, credit cards and debit cards cannot be used by the card holders. They can only engage in cash transactions until the system is repaired and is returned to service.
The failure of a 911 system could result in the destruction of property or the loss of life. The failure of an air-traffic control system could ground all flights in a wide area.
In mission-critical systems such as these, it is common to deploy two or more computer systems for reliability. Should one computer system fail, the other computer system is available to carry on the provision of services.
7. Redundant System
The availability of a computing system can be significantly enhanced by providing a second system that can continue to provide services to the end users should one system fail. The two systems are often configured as an active/backup system or as an active/active system, although other configurations are also possible. The systems are interconnected via a computer network so they can interact with each other.
In an active/backup system (FIG. 2), the active system keeps its backup system synchronized by replicating database changes to it, so that the backup system is ready to take over processing immediately should the active system fail.
In an active/active system (FIG. 3), both systems are processing transactions. They keep each other synchronized via bidirectional data replication. When one system processes a transaction and makes changes to its database, it immediately replicates those changes to the other system's database. In that way, a transaction can be routed to either system and be processed identically. Should one system fail, all further transactions are routed to the surviving system.
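The active/active behavior described above can be sketched as follows. This is a minimal illustration only, not the replication engine of any particular product: the `System` class, the `route` function, and the key `account-512` are hypothetical names introduced here, and replication is modeled as a direct in-memory call rather than as a network transfer.

```python
class System:
    """One node of a hypothetical active/active pair."""

    def __init__(self, name):
        self.name = name
        self.database = {}
        self.peer = None
        self.available = True

    def process(self, key, value):
        # Apply the change locally, then replicate it to the peer.
        self.database[key] = value
        if self.peer is not None and self.peer.available:
            self.peer.apply_replicated(key, value)

    def apply_replicated(self, key, value):
        # Replicated changes are applied but not sent back, so a change
        # does not loop endlessly between the two systems.
        self.database[key] = value


def route(systems, key, value):
    # Route the transaction to the first surviving system.
    for system in systems:
        if system.available:
            system.process(key, value)
            return system.name
    raise RuntimeError("no system available")


a, b = System("A"), System("B")
a.peer, b.peer = b, a

a.process("account-374", 10)   # processed on A, replicated to B
b.process("account-512", 25)   # processed on B, replicated to A
in_sync = a.database == b.database

a.available = False            # simulate a failure of system A
served_by = route([a, b], "account-374", 74)
```

In this sketch, both databases hold the same contents after each transaction, and once system A is marked unavailable, the next transaction is served by system B.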
8. Non-Redundant System
In some environments, a second system is not immediately available. Rather, if there is a failure of the computer system, a second system is procured and the application and data are loaded onto it.
9. Online Backup of an Active Database
All of the services that a system provides to its users are generally determined by the data that it is processing and the data that it may have stored. It is therefore imperative to protect that data from loss due to hardware failure, software failure, human error or malfeasance, or any other fault condition. Replication alone does not provide this protection. For example, if an operator accidentally deletes an important file, that file will disappear from the primary system and, if replication is configured and operational, the replication engine will dutifully delete the file from the backup system as well.
Consequently, it is common practice to periodically back up the data onto a medium such as magnetic tape, virtual tape (magnetic tape images typically stored on disk, which may be remotely located), cloud infrastructure, solid-state storage, or other persistent storage, as shown in FIG. 4a (1). Throughout this specification, the term ‘tape’ for the backup copy medium is meant to include all of these storage media, locations, and technologies and is not meant to limit the reference to just classic electronic tape technologies. The use of the word ‘tape’ implies a persistent storage device.
Such a backup is commonly known as a ‘dump’ of the data. Generally, a backup is taken of a source database (2) while it is actively supporting transaction processing (3). Thus, the source database is changing as the backup takes place. This is known as an online backup or an “online dump” (4). The source database (2) is thus an “online database.”
The problem with an online backup is that it takes time to complete, and changes are occurring to the database during this time. Changes to the database are captured only for data that has not yet been written to the backup medium. Data written early in the backup phase is missing subsequent changes, but data written later in the backup contains more of the changes. The data in the backup is therefore inconsistent.
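The inconsistency of an online backup can be illustrated with a small simulation. The sketch below is an assumption-laden toy: the function `online_backup`, the keys `A` and `B`, and the amounts are all hypothetical, and the database is modeled as an in-memory dictionary that is copied one key at a time while changes arrive.

```python
def online_backup(source, copy_order, changes_during_backup):
    """Copy `source` one key at a time while changes keep arriving.

    `changes_during_backup[i]` lists the (key, value) changes applied to
    the source just before the i-th key in `copy_order` is copied.
    """
    backup = {}
    for step, key in enumerate(copy_order):
        for changed_key, value in changes_during_backup.get(step, []):
            source[changed_key] = value
        backup[key] = source[key]
    return backup


source = {"A": 100, "B": 200}
start_state = dict(source)

# Hypothetical transaction: move 50 from A to B after A has been
# copied but before B has been copied.
changes = {1: [("A", 50), ("B", 250)]}

backup = online_backup(source, ["A", "B"], changes)
end_state = dict(source)
```

Here the backup holds the old value of `A` and the new value of `B`. It matches neither the state of the source when the backup started nor its state when the backup ended, and the total across both keys differs from the total in the source at either time.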
10. Database Restore from an Online Backup
In order to restore a consistent (e.g., from a relational perspective, logically complete and usable to applications) database on a target system, the changes that are occurring during and following the backup must be written to a persistent change log such as an audit trail, a redo log, a journal, or equivalent data structure. In FIG. 4a, the oldest changes have been written to Change Log 1 (5) and the newest changes to Change Log 4 (6).
The restore process is shown in steps (7) and (8). It typically involves marking the persistent change log (or the backup copy), by any of various methods, to note the time or relative position in the change log at which the backup began (7). The “restore database” (interchangeably referred to as the “target database”) is created on the target system by loading the backup copy (also called the “online backup” or dump) from the persistent storage device. The pertinent change logs are then rolled forward (8), in order, to apply the changes that occurred after the backup started and thereby make the target database consistent and complete. This sequence is usually executed without user access to the data while it is being restored and rolled forward, for several reasons: the data is out of date, it may be inconsistent, and the restore can often be performed much faster using bulk-apply methods when no user access is allowed.
FIG. 4b shows a typical restore process in more detail, illustrating the sequence of steps used to create the restore database and then roll it forward to make it consistent and complete. In this figure, the backup copy is copied or loaded (2) from the persistent storage device (1) onto the restore database (3), and then the change logs are rolled forward (4), in order, to apply the changes that occurred after the backup started. Note that since Change Log 1 (4) was created before the beginning of the backup (5), it would typically be skipped (not rolled forward). (If it is rolled forward, it will typically just make the target database inconsistent while it is being rolled forward.) Change Logs 2, 3, and 4 (6) were created after the backup was started, so they would normally be rolled forward (7) and applied to the target database (restore database) to make it consistent and complete.
In FIG. 4a, the pertinent change logs are Change Logs 2, 3, and 4. Change Log 1 was created before the backup began; its changes were already reflected in the source database, and were therefore captured by the backup operation, at the time the backup began. Consequently, once the backup copy has been loaded onto the target database, the changes in Change Logs 2, 3, and 4 must be applied to the target database to bring it to a current and consistent state. The target is then consistent and current at least as of the time that the backup operation ended, since additional changes are likely being made to the source database after the backup ended.
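The selection and roll forward of the pertinent change logs can be sketched as follows. This is a simplified illustration under stated assumptions: the `ChangeLog` class and the functions `pertinent_logs` and `restore` are hypothetical, the backup-start marker is reduced to a log number, and the Account 374 values are those given for FIG. 4a in this specification.

```python
from dataclasses import dataclass, field


@dataclass
class ChangeLog:
    number: int
    # (key, value) pairs in the order the changes were made
    changes: list = field(default_factory=list)


def pertinent_logs(logs, backup_start_log):
    """Return the logs at or after the marked backup start, oldest first."""
    selected = [log for log in logs if log.number >= backup_start_log]
    return sorted(selected, key=lambda log: log.number)


def restore(backup_copy, logs, backup_start_log):
    """Load the backup copy, then roll the pertinent logs forward in order."""
    target = dict(backup_copy)
    for log in pertinent_logs(logs, backup_start_log):
        for key, value in log.changes:
            target[key] = value
    return target


logs = [
    ChangeLog(1, [("account-374", 10)]),  # created before the backup began
    ChangeLog(2, [("account-374", 74)]),
    ChangeLog(3, [("account-374", 38)]),
    ChangeLog(4, [("account-374", 92)]),
]
backup_copy = {"account-374": 10}

applied = [log.number for log in pertinent_logs(logs, backup_start_log=2)]
target = restore(backup_copy, logs, backup_start_log=2)
```

With the marker set at Change Log 2, Change Log 1 is skipped, Change Logs 2, 3, and 4 are applied in order, and the target ends with Account 374 at $92.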
A problem with this technique is that several change logs may be required to hold the changes that occurred during the backup. For a very active source application with many changes occurring per second, there may be many such change logs required to hold all of the changes that occurred during the backup.
For instance, as shown in FIG. 4a, Account 374 is initially backed up with an account value of $10. The change that set this value was recorded in log file 1, before the backup began. Account 374 is subsequently updated to $74, then to $38, and finally to $92; this sequence is reflected in the log files. These are also the values that are applied to Account 374 as the roll forward takes place. More specifically, the restore will write out the value that Account 374 had when the original backup was taken ($10). The log files will then be replayed in succession, starting with log file 2, then log file 3, then log file 4, as shown in FIG. 4a. Unfortunately, this replays the superseded values for this account ($74, then $38) before ultimately ending at the correct account value of $92.
Furthermore, as shown in FIG. 5, many of the changes that occur during the backup operation may already have been captured by the backup if they occurred after the backup operation started but before the affected data objects (or parts of the database) were copied to the backup medium. These changes thus duplicate data that has already been backed up. Worse, there could be a series of changes to the same data that occurred after the backup began but before that data was backed up, and rolling forward through those changes will cause the restored data to reflect older (and inconsistent) values while it is being rolled forward, as shown in FIG. 5. For instance, as shown in that figure, Account 374 starts off at $10 when the backup starts and is updated to $74, then to $38, and finally to $92; however, it is not backed up until its value is $38, the value set by the change that was captured in log file 3.
Using the prior-art method of restore and roll forward, Account 374 will initially be restored as $38. As the log files are processed, it will then be set back to the older value of $74 from log file 2, set to $38 again from log file 3, and only then updated to $92 from log file 4.
Consequently, restoring a backup requires rolling forward through several change logs, which may take a great deal of time, and storing all of the change log files may consume a great deal of the storage medium. Furthermore, rolling forward through all the changes that occurred during the backup will make the restored data out of date and inconsistent until the final set of changes is replayed from the log file(s).
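The difference between the FIG. 4a and FIG. 5 cases for Account 374 can be made concrete by labeling each replayed change relative to the value captured in the backup. The sketch below is illustrative only: the function `classify_replay` and its labels are hypothetical, and the comparison by position in the history works only because the four account values in this example are distinct.

```python
# Values of Account 374 as recorded in log files 1 through 4.
SOURCE_HISTORY = [10, 74, 38, 92]


def classify_replay(backed_up_value, replayed_values, history=SOURCE_HISTORY):
    """Label each replayed change relative to the value in the backup."""
    captured_at = history.index(backed_up_value)
    labels = []
    for value in replayed_values:
        position = history.index(value)
        if position < captured_at:
            labels.append((value, "older than backup"))
        elif position == captured_at:
            labels.append((value, "duplicate of backup"))
        else:
            labels.append((value, "new"))
    return labels


# FIG. 4a: the account is backed up at $10, then log files 2-4 are replayed.
fig_4a = classify_replay(10, [74, 38, 92])

# FIG. 5: the account is backed up at $38, then log files 2-4 are replayed.
fig_5 = classify_replay(38, [74, 38, 92])
```

In the FIG. 4a case every replayed change is newer than the backed-up value. In the FIG. 5 case the first replayed change moves the restored account backward to $74, the second merely repeats the $38 that was already in the backup, and only the third contributes new information.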
Additionally, during this process, the source database is still being updated; and these changes must also be logged and rolled forward to update the restored backup to a current and consistent state as of the time period when the backup operation ended.
11. Oracle Snapshots
Oracle uses snapshots to replicate data to non-master sites in a replicated environment. A snapshot is a prior art replica of a target master table from a single point in time. Snapshots are updated from one or more master tables through individual batch updates.
A snapshot allows one to go back in time to earlier values in the database—specifically at the snapshot time. To do this, the current database is basically UNIONed with (or overlaid by) the snapshot database, using the values from the snapshot database to replace the current values in the current database. In this approach, the snapshot database holds the original value (as of the snapshot time) of the data element(s) that have subsequently changed over time. Note that new data elements added after the snapshot time will not be reflected in the snapshot image of the database, and data elements removed since the snapshot time will still be reflected in the snapshot image of the database.
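The overlay described above can be sketched generically as follows. This is not Oracle's implementation or API; it is a hypothetical copy-on-write model in which the class `SnapshotDatabase`, its methods, and the `ABSENT` marker are assumptions introduced for illustration. The marker records that an element did not exist at the snapshot time, which is one way to keep later additions out of the snapshot image.

```python
ABSENT = object()  # hypothetical marker: element did not exist at snapshot time


class SnapshotDatabase:
    def __init__(self):
        self.current = {}
        # Original values of elements changed since the snapshot was taken.
        self.snapshot = None

    def take_snapshot(self):
        self.snapshot = {}

    def _preserve(self, key):
        # Save the original value only the first time an element changes.
        if self.snapshot is not None and key not in self.snapshot:
            self.snapshot[key] = self.current.get(key, ABSENT)

    def write(self, key, value):
        self._preserve(key)
        self.current[key] = value

    def delete(self, key):
        self._preserve(key)
        self.current.pop(key, None)

    def as_of_snapshot(self):
        # Overlay the current database with the preserved original values.
        view = dict(self.current)
        view.update(self.snapshot or {})
        return {k: v for k, v in view.items() if v is not ABSENT}


db = SnapshotDatabase()
db.write("a", 1)
db.write("b", 2)
db.take_snapshot()

db.write("a", 5)   # changed after the snapshot
db.write("c", 9)   # added after the snapshot
db.delete("b")     # removed after the snapshot

snapshot_image = db.as_of_snapshot()
```

In this sketch the snapshot image shows the original value of `a`, still contains the removed element `b`, and omits the element `c` that was added after the snapshot time, while the current database continues to hold the latest values.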
Note that whereas the snapshot approach allows one to go back to prior values of the database, the current invention allows one to go forward from the prior (and possibly inconsistent) values of the database to the latest (and optionally consistent) values of the database.
12. What Is Needed
What is needed is a means for backing up a database and for restoring it to a consistent state (for instance, every child row has a parent row) with a minimum of change logs so that the restoration of a backup can be executed as quickly as possible, the change logs can consume as little of the storage medium as possible, and the target data is kept as consistent as possible while the restore is taking place.
Since there may be additional changes made to the source database after the time that the backup operation ended, it is also desirable to save these additional changes so that the backup information can continue to remain consistent, complete, and current as of the current state of the source database from that point forward. This is referred to as creating and maintaining a “Continuous Backup”.