Historically, computer systems have stored data on a storage medium, such as in a file or database stored on a hard disk drive. However, such a storage medium is vulnerable to data loss due to a data corruption event, such as a physical failure of the hard disk or a power failure that suddenly shuts down the computer system. For example, such an event can corrupt the stored data by interrupting the writing of a block of data to a hard disk. Such a data corruption event can also cause a data file to be updated inconsistently, because a data change affecting different parts of the file is not applied to all of those parts. Thus, some parts of the file are updated, and some parts that should be updated are not. In addition, a sudden shutdown of the computer system, a lightning strike, or a problem in the external power source (e.g., the electrical power utility) can cause a power surge that destroys data on different parts of the storage medium, such as a hard disk, resulting in a data file that is no longer valid.
Computer systems have mechanisms that guard against losing data. In one conventional approach, the computer system relies on a backup system to back up data files from the storage medium. Typically, the backups occur based on a predetermined time period or regular schedule, such as a daily backup. The backups can be a complete backup of all files on the storage medium, or can be an incremental backup. For example, an incremental backup system backs up only those data files that have changed within the time period.
In another conventional approach, an operating system provides for file system journaling for operations such as purging files or modifying directories, which is useful when there is an unexpected shutdown of the operating system or computer, or some other problem occurs that affects the state of the file system. In a further conventional approach, a computer system, an application executing on the computer system, or a database maintains a transaction log, which can be used to reinitiate the transactions interrupted or never initiated due to a data corruption event.
In the case of a failure, the conventional backup approach typically requires that a backup tape be located, often by a human operator, loaded on a tape drive, and read by the computer system to locate the files that have been corrupted. This approach can be time consuming and subject to failure if the backup tapes themselves fail for some reason, or human operators do not run the backup tapes reliably. The recovery process can be complicated when incremental backups are used, because, if a substantial number of files are lost, then several different incremental backup tapes may have to be located to recover all the files that were compromised.
In conventional approaches, such as those using a log or journal, there is typically no guarantee that the log or journal itself is not corrupt, or that a saved or retrieved version of a database is not also corrupted by the corruption event. Thus, a corrupt log or journal may be used to update a valid previous version of a database, leading to an invalid update to the database. Alternatively, a valid log or journal may be applied to a corrupt previous version of a database, also leading to an invalid update to the database.
The problem of recovering a corrupted database is more acute for computerized devices that may be shut down routinely by a power disconnection or other means. For example, this problem often applies to network devices, such as routers or other devices used in a content distribution network (CDN). Such network devices typically maintain a database (or hash table) including an identifier and configuration information for other similar devices on a network. The devices may be subject to sudden shutdowns because users expect to be able to disconnect the power cord, move or service the device, and reconnect the device as needed, without performing a backup procedure or checking whether a backup or journaling system is working properly. Such network devices typically are computers that do not provide user-oriented input/output devices, such as graphic displays or keyboards, that would allow a human user easy access to check backup or journaling systems manually. Thus, there is a need for a robust automatic recovery system designed to maintain such a database or hash table in such a device and enable rapid recovery of a valid version of the database if it is corrupted by a sudden shutdown or power surge.
In contrast, the invention is directed to techniques for modifying a database based on journals that include operations to be performed on the database. The journals enable verification of the validity of the operations prior to modifying the database in order to prevent corruption of the database due to the processing of an invalid operation. Furthermore, the journals enable recreation or recovery of the database using an older version of the database and archived journals.
In one arrangement, a database manager functions on a computerized device, such as a network device, to provide a robust recovery system for a database accessed by the computerized device. The database manager receives, over a network or from some other source, operations to be performed on the database, such as a write operation that enters a new data value or modifies an existing data value in the database. The database manager enters the operations as operation records in a journal and generates an error detection value, such as a message digest, that can be used to check the validity of each operation record. The operation records are entered in the journal in the sequence to be used when applying the operation records to modify the database. After a predetermined number of entries have been made (or after a preset time period), the database manager copies the existing database to a new version of the database and modifies the copied database based on the operation entries from the journal. The database manager then generates an error detection value for the modified version of the database and clears the entries from the journal, so that a revised version of the journal can be started. The database manager then adds additional operation records to the revised version of the journal, makes a new copy of the modified database, and modifies that copy with the additional operation records. The database manager continues this process of clearing the journal, producing new versions of the journal with additional operation entries, and producing new modified versions of the database.
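By way of illustration only, the journal-then-copy cycle described above might be sketched as follows. The sketch assumes an in-memory dictionary as the database, a SHA-256 digest as the error detection value, and a count-based trigger; the names `DatabaseManager`, `receive`, `checkpoint`, `checkpoint_every`, and `digest_of` are hypothetical and do not come from the description.

```python
import copy
import hashlib
import json


def digest_of(obj):
    """Return a SHA-256 hex digest of a JSON-serializable object."""
    data = json.dumps(obj, sort_keys=True).encode("utf-8")
    return hashlib.sha256(data).hexdigest()


class DatabaseManager:
    """Hypothetical manager: journals writes, then applies them to a copy."""

    def __init__(self, checkpoint_every=3):
        # Each version is stored with its own error detection value.
        self.versions = [({}, digest_of({}))]
        # Each journal entry is (operation, error detection value).
        self.journal = []
        self.checkpoint_every = checkpoint_every

    def receive(self, key, value):
        """Enter a write operation in the journal, in arrival order."""
        op = {"key": key, "value": value}
        self.journal.append((op, digest_of(op)))
        if len(self.journal) >= self.checkpoint_every:
            self.checkpoint()

    def checkpoint(self):
        """Copy the current database, apply the journal, clear the journal."""
        current, _ = self.versions[-1]
        new_db = copy.deepcopy(current)  # the existing version is left untouched
        for op, op_digest in self.journal:
            if digest_of(op) != op_digest:
                raise ValueError("invalid operation record")
            new_db[op["key"]] = op["value"]
        self.versions.append((new_db, digest_of(new_db)))
        self.journal.clear()


manager = DatabaseManager(checkpoint_every=2)
manager.receive("router-a", "10.0.0.1")
manager.receive("router-b", "10.0.0.2")  # second entry triggers a checkpoint
manager.receive("router-c", "10.0.0.3")  # stays in the revised journal
```

In this sketch the earlier version is never modified in place, so an interruption during `checkpoint` would leave the previous version and its error detection value intact.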
If there is a data corruption event, the database manager can check the validity of earlier versions of the database by using the error detection value for each database. The database manager can thus determine the most recent valid version of the database. Assuming, for example, that the current database is invalid, the database manager checks the different versions of the journals and applies them to the most recent valid version of the database to produce a current, valid version of the database. The database manager checks the validity of each operation entry in each journal by checking the error detection value for that operation entry. Thus, the database manager uses the verified operation entries to revise the most recent valid version of the database until encountering an operation entry that is not valid, as indicated by the error detection value for that operation entry. The database manager can then generate an error signal indicating that the database has been partially recovered, and provide the signal to the computerized device or over the network to some destination, such as a network monitoring computer operated by a human operator. Then, for example, the monitoring computer or human operator can determine whether the partial recovery is sufficient, or locate an archived or backup copy of the database if one is available.
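By way of illustration only, the recovery procedure described above might be sketched as follows. The sketch assumes dictionary databases, SHA-256 digests as error detection values, and a list layout in which `journals[i]` holds the entries that turn `versions[i]` into `versions[i + 1]`; the function `recover` and its boolean return flag (standing in for the error signal) are hypothetical.

```python
import copy
import hashlib
import json


def digest_of(obj):
    """Return a SHA-256 hex digest of a JSON-serializable object."""
    data = json.dumps(obj, sort_keys=True).encode("utf-8")
    return hashlib.sha256(data).hexdigest()


def recover(versions, journals):
    """Rebuild a database from the most recent valid version.

    versions: list of (database, digest), oldest first.
    journals: journals[i] holds the (operation, digest) entries applied
              after versions[i].
    Returns (database, complete); complete is False on partial recovery.
    """
    # Walk backwards to find the most recent version whose digest matches.
    for index in range(len(versions) - 1, -1, -1):
        database, stored_digest = versions[index]
        if digest_of(database) == stored_digest:
            break
    else:
        raise RuntimeError("no valid database version found")

    recovered = copy.deepcopy(database)
    for journal in journals[index:]:
        for op, op_digest in journal:
            if digest_of(op) != op_digest:
                return recovered, False  # stop at the first invalid entry
            recovered[op["key"]] = op["value"]
    return recovered, True


op_a = {"key": "a", "value": 1}
op_b = {"key": "b", "value": 2}
op_c = {"key": "c", "value": 3}
journal_0 = [(op_a, digest_of(op_a))]
journal_1 = [(op_b, digest_of(op_b)), (op_c, digest_of(op_c))]
versions = [
    ({}, digest_of({})),
    ({"a": 1}, digest_of({"a": 1})),
    # Simulated corruption: the contents no longer match the stored digest.
    ({"a": 1, "b": "garbage"}, digest_of({"a": 1, "b": 2, "c": 3})),
]

recovered, complete = recover(versions, [journal_0, journal_1])

# Second scenario: the last journal entry is also damaged.
damaged_journal_1 = [journal_1[0], ({"key": "c", "value": "???"}, digest_of(op_c))]
partial, partial_complete = recover(versions, [journal_0, damaged_journal_1])
```

In the first scenario the full current state is rebuilt from the second version; in the second, replay stops before the damaged entry and the flag reports a partial recovery.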
In one embodiment, the invention is directed to a method in a computer system for updating a database. The method includes entering operation entries in a sequence in a journal, where the operation entries define operations suitable for modifying a first database. The method also includes copying the first database to a second database, and modifying the second database by applying the operations defined in the operation entries to the second database in the sequence indicated by the journal. Thus, the state of the first database is preserved and a database manager can save the first database, for example to an archive containing different versions of the database.
In another embodiment, the method includes generating an error detection value for each operation entry, entering the error detection value in the journal, and verifying the validity of each operation entry based on the error detection value for that operation entry. For example, the error detection value is a one-way hash value that provides a value based on the operation entry, and which is different for each operation entry (to a high level of probability). Thus, a database manager or other program can verify the validity of each individual operation entry.
The method includes, in another embodiment, providing a message digest for each operation entry. Thus, the error detection value is a message digest, such as an MD5 message digest.
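By way of illustration only, a per-entry error detection value might be generated and checked as sketched below. The text names MD5 as one possible message digest; this sketch substitutes SHA-256 from Python's standard `hashlib` module, and the record layout and the names `make_record` and `record_is_valid` are hypothetical.

```python
import hashlib
import json


def make_record(seq, operation):
    """Build a journal record that carries its own digest."""
    # Serialize deterministically so the same entry always hashes the same.
    payload = json.dumps({"seq": seq, "op": operation}, sort_keys=True)
    digest = hashlib.sha256(payload.encode("utf-8")).hexdigest()
    return {"payload": payload, "digest": digest}


def record_is_valid(record):
    """Recompute the digest and compare it with the stored one."""
    expected = hashlib.sha256(record["payload"].encode("utf-8")).hexdigest()
    return expected == record["digest"]


record = make_record(1, {"type": "write", "key": "node-7", "value": "10.0.0.7"})

# Simulate corruption of the stored entry without updating its digest.
tampered = dict(record, payload=record["payload"].replace("10.0.0.7", "10.0.0.8"))
```

Here `record_is_valid(record)` holds for the untouched record and fails for `tampered`, since any change to the payload changes the recomputed digest with very high probability.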
In a further embodiment, the method includes generating a first error detection value for the first database and, after modifying the second database, a second error detection value for the second database. Thus, each version of the database has its own error detection value that can be used at a later point in time to verify the database.
In another embodiment, the method includes verifying the validity of the second database based on the second error detection value and, if the second database is valid, generating a first output indicating that the second database is valid. The method also includes, if the second database is not valid, verifying the validity of the first database based on the first error detection value, copying the first database to a third database, modifying the third database by applying the operations defined in the operation entries in the sequence indicated by the journal to the third database, and generating a second output indicating that the second database is not valid and the third database is valid. Thus, the database manager can use the error detection value to verify a version of the database and check whether that version is valid or not.
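By way of illustration only, the verify-then-fall-back logic of this embodiment might be sketched as follows, assuming dictionary databases and SHA-256 digests as the error detection values. The names `verify_or_rebuild` and `apply_journal`, and the wording of the returned output strings, are hypothetical.

```python
import copy
import hashlib
import json


def digest_of(obj):
    """Return a SHA-256 hex digest of a JSON-serializable object."""
    data = json.dumps(obj, sort_keys=True).encode("utf-8")
    return hashlib.sha256(data).hexdigest()


def apply_journal(database, journal):
    """Apply the journal's operations, in order, to a copy of the database."""
    result = copy.deepcopy(database)
    for op in journal:
        result[op["key"]] = op["value"]
    return result


def verify_or_rebuild(first, first_digest, second, second_digest, journal):
    """Return (database, output) following the two-step check."""
    if digest_of(second) == second_digest:
        return second, "second database is valid"
    if digest_of(first) != first_digest:
        raise RuntimeError("neither database version is valid")
    # Rebuild a third database from the still-valid first database.
    third = apply_journal(first, journal)
    return third, "second database is not valid; third database is valid"


first = {"a": 1}
first_digest = digest_of(first)
journal = [{"key": "b", "value": 2}]
second = apply_journal(first, journal)
second_digest = digest_of(second)

second["b"] = 99  # simulated corruption after the digest was recorded

result, output = verify_or_rebuild(first, first_digest, second, second_digest, journal)
```

Because the first database is only ever copied, it remains available as the starting point for the third database when the second fails verification.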
In an additional embodiment, the method includes clearing the operation entries from the journal in response to one of the first output and the second output. Thus, the database manager, or another program, clears the journal after an indication that there is a valid version of the database. The computer system or database manager can now provide additional operation entries for the journal, as new operation entries are received or generated.
In a further embodiment, the method includes providing a first message digest for the first database and, after the step of modifying the second database, a second message digest for the second database. Thus, the error detection value for each database is a message digest.
In another embodiment, the method includes clearing the operation entries from the journal to produce an empty version of the journal. The method also includes providing saved journals by repeating, as long as additional operation entries are received, steps (i) through (iv), as follows: (i) producing a revised version of the journal by entering the additional operation entries in the empty version of the journal, (ii) generating a revised version of the database based on modifying a copy of a current version of the database based on the additional operation entries retrieved from the revised version of the journal, (iii) saving the revised version of the journal to one of the saved journals having an identifier that identifies the saved journal uniquely in comparison to other saved journals, and (iv) removing the additional operation entries from the journal to produce the empty version of the journal. The database manager, or other program, continues the process of receiving additional operation entries, adding the entries to a journal, and updating additional versions of the database.
In another embodiment, the method includes selecting one of the saved journals based on the identifier for that journal and modifying a retrieved version of the database based on retrieving the additional operation entries from the selected saved journal. Thus, a saved journal can be used to update the database based on retrieving an earlier version of the database.
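By way of illustration only, the repeated steps (i) through (iv) and the later selection of a saved journal by its identifier might be sketched as follows. The sketch assumes dictionary databases, batches of entries arriving one cycle at a time, and identifiers of the form `journal-0001`; the names `run_cycles` and `apply_entries` and the identifier format are hypothetical.

```python
import copy


def apply_entries(database, entries):
    """Apply operation entries, in order, to a copy of the database."""
    result = copy.deepcopy(database)
    for op in entries:
        result[op["key"]] = op["value"]
    return result


def run_cycles(database, batches):
    """Run steps (i)-(iv) once per batch of additional operation entries."""
    versions = [copy.deepcopy(database)]
    saved_journals = {}
    journal = []  # the empty version of the journal
    for cycle, batch in enumerate(batches, start=1):
        journal.extend(batch)                              # (i) revised journal
        revised = apply_entries(versions[-1], journal)     # (ii) revised database
        saved_journals[f"journal-{cycle:04d}"] = list(journal)  # (iii) save with unique id
        journal.clear()                                    # (iv) empty the journal
        versions.append(revised)
    return versions, saved_journals


batches = [
    [{"key": "a", "value": 1}],
    [{"key": "b", "value": 2}],
    [{"key": "a", "value": 3}],
]
versions, saved_journals = run_cycles({}, batches)

# Select a saved journal by identifier and replay it on a retrieved version.
retrieved = versions[1]  # the version produced by the first cycle
replayed = apply_entries(retrieved, saved_journals["journal-0002"])
```

Keeping each saved journal under its own identifier lets an earlier database version be brought forward one journal at a time.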
In some embodiments, the techniques of the invention are implemented primarily by computer software. The computer program logic embodiments, which are essentially software, when executed on one or more hardware processors in one or more hardware computing systems, cause the processors to perform the techniques outlined above. In other words, these embodiments of the invention are generally manufactured as a computer program stored on a disk, memory, card, or other such media that can be loaded directly into a computer, or downloaded over a network into a computer, to make the device perform according to the operations of the invention. In another embodiment, the techniques of the invention are implemented in hardware circuitry, such as an integrated circuit (IC) or application specific integrated circuit (ASIC).