Commercial entities record in databases information regarding many facets of the matters in which they are involved, e.g., transactions in which they engage. Typically, the entities' internal policies, or regulations of governing bodies aimed at improving the entities' accountability, require reporting of such recorded data at particular time intervals or even at arbitrary times. Example regulations are those promulgated in the Sarbanes-Oxley Act of 2002, in particular section 302, and those proposed by Basel II. To comply with such regulations and to enable auditing and investigation of the entities' historic data, the entities archive their databases. Such archiving provides a view of the state of a database at a particular time.
Conventional database archiving methods include backup archiving and incremental archiving, both of which are inefficient with respect to monetary cost, processing time, and/or reliability.
For backup database archiving, a snapshot of a database at a particular time is taken and stored. The snapshot is a copy of the database at the time it was taken. Accordingly, reference to the snapshot provides a view of the database state at that time. Since many such snapshots are required, multiple copies of the database are stored, each providing a snapshot of the database at a different time. The snapshots are usually stored on dismountable media, such as tape. A large database can require tens or hundreds of tapes for a single snapshot, and multiples thereof for more than one snapshot, which is costly. Further, since tape is prone to failure, many entities duplex their snapshots, doubling the cost. Further, much processing time is required for recording the database snapshots. Finally, as new database software releases are produced and implemented by the entities, the entities must continue to support older versions of the database software in order to view the database snapshots stored according to those older versions.
Incremental database archiving is implemented to mitigate some of the media costs incurred in storing database snapshots. For incremental database archiving, a single snapshot is stored along with a log of database updates to be applied for obtaining a second snapshot at a second time. Another log of updates from the second time is stored for obtaining a third snapshot at a third time based on (a) the phantom second snapshot, which must first be restored based on the first log of updates, and (b) the second log of updates, and so on. For obtaining a snapshot of the database state at a time subsequent to the time to which a previous snapshot corresponds, the logged updates up to the time of the required snapshot are applied to the previous snapshot. Even this method requires expenditures on a significant amount of the dismountable media for the initial snapshot and the update logs. Further, the probability of successfully obtaining a snapshot decreases exponentially with each additional incremental update upon which the snapshot is based. If the required snapshot is based on c incremental logs, and the probability of failure of a snapshot based on a single increment of the log is p, then the probability of successfully obtaining the required snapshot is (1-p)^c. For example, if there is a 5% probability of media failure for each log increment, and the required snapshot is based on a chain of 10 update log increments, then the probability of successfully obtaining the required snapshot is 0.95^10, or approximately 60%.
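The incremental scheme and its compounding failure risk can be illustrated with a minimal Python sketch. The names restore_snapshot and chain_success_probability are hypothetical and introduced only for illustration. The sketch models a snapshot as a dictionary and each update log as a dictionary of changed values, and it assumes that log increments fail independently of one another; none of these modeling choices is taken from any particular archiving product.

```python
def restore_snapshot(base_snapshot, update_logs):
    """Rebuild a later database state by applying each update log in order.

    base_snapshot: dict modeling the stored initial snapshot (illustrative).
    update_logs:   list of dicts, each holding the values changed during
                   one increment (illustrative log format).
    """
    state = dict(base_snapshot)      # copy so the stored snapshot is untouched
    for log in update_logs:          # every earlier log must be readable
        state.update(log)
    return state


def chain_success_probability(p, c):
    """Probability that all c log increments are usable, assuming each
    fails independently with probability p: (1 - p) ** c."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability between 0 and 1")
    if c < 0:
        raise ValueError("c must be non-negative")
    return (1.0 - p) ** c


if __name__ == "__main__":
    first = {"account": 100}
    logs = [{"account": 120}, {"account": 90, "fee": 5}]
    print(restore_snapshot(first, logs))
    print(round(chain_success_probability(0.05, 10), 2))
```

Under these assumptions, a chain of 10 increments with a 5% failure probability each yields roughly a 60% chance of success, matching the figure given above. Restoring the third state also requires the first log, which is why a single unreadable increment makes every later snapshot unobtainable.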
Accordingly, there is a need in the art for a method, system, and database archive for increasing efficiency of database archiving while complying with internally or governmentally promulgated archiving regulations.