1. Field of the Invention
The present invention relates to the storage and maintenance of data in databases. More particularly, the invention concerns the use of a primary database's log records to update a secondary database that has a different encoding scheme than the primary database.
2. Description of the Related Art
Databases & Log Records
Computers are especially useful for storing, searching, and analyzing amounts of data too large to record by hand. Computer databases typically store data on a nonvolatile storage device such as a magnetic disk drive, magnetic tape, and/or various optical storage media.
Adding data to a database is straightforward, since the data can simply be appended to the database. However, it is more complicated to change, insert, or delete data. These functions are aided by log records. One function of log records is to store "updates" to the database, where one or more data entries are changed. Log records also store "insertions" to the database, where one or more data entries are inserted into the database. Likewise, log records store "deletions" to the database, where one or more data entries are deleted from the database. As an example, one well-known logging technique is "write-ahead logging".
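As an illustration only, the role of log records can be sketched in a few lines of Python. The names LogRecord, Database, and replay are hypothetical and do not correspond to any particular database product. The sketch assumes an in-memory table and follows the write-ahead convention of recording each change in the log before applying it; a real system would also force the log to nonvolatile storage.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class LogRecord:
    """Hypothetical log record describing one change to the database."""
    op: str                      # "insert", "update", or "delete"
    key: int                     # identifies the affected data entry
    value: Optional[str] = None  # new contents (unused for "delete")


def apply_record(rows: Dict[int, str], rec: LogRecord) -> None:
    """Apply a single logged change to a table held as a dict."""
    if rec.op == "delete":
        rows.pop(rec.key, None)
    elif rec.op in ("insert", "update"):
        rows[rec.key] = rec.value
    else:
        raise ValueError(f"unknown operation: {rec.op}")


class Database:
    """Toy in-memory database that logs every change before applying it."""

    def __init__(self) -> None:
        self.rows: Dict[int, str] = {}
        self.log: List[LogRecord] = []

    def change(self, rec: LogRecord) -> None:
        self.log.append(rec)          # write ahead: log first
        apply_record(self.rows, rec)  # then modify the data


def replay(log: List[LogRecord]) -> Dict[int, str]:
    """Rebuild the table contents from the log records alone."""
    rows: Dict[int, str] = {}
    for rec in log:
        apply_record(rows, rec)
    return rows


db = Database()
db.change(LogRecord("insert", 1, "1000 Maple Street"))
db.change(LogRecord("update", 1, "2000 Oak Avenue"))
db.change(LogRecord("insert", 2, "temporary entry"))
db.change(LogRecord("delete", 2))
```

Because every change is captured in the log, replaying the log against an empty table reproduces the current contents of the database in this sketch.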
Another beneficial component of most databases is a buffer, comprising a fast-access volatile memory that first receives any changes to the database. In some systems, such as DB2 databases, changes stored in the buffer are not written to the log until the user issues a "commit" instruction. At that point, the buffered database changes are stored in the log and deleted from the buffer. Such a buffer may be known as a "log buffer".
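The buffering behavior described above might be sketched as follows. This is a simplified illustration, not a description of how DB2 or any other product is implemented; the class name LogBuffer and its methods are assumptions made for the example, and plain Python lists stand in for volatile memory and the stored log.

```python
from typing import List


class LogBuffer:
    """Hypothetical log buffer: holds changes until a commit is issued."""

    def __init__(self) -> None:
        self.pending: List[str] = []  # stands in for fast volatile memory
        self.log: List[str] = []      # stands in for the stored log

    def record(self, change: str) -> None:
        """Receive a change; nothing reaches the log yet."""
        self.pending.append(change)

    def commit(self) -> None:
        """Move all buffered changes to the log and empty the buffer."""
        self.log.extend(self.pending)
        self.pending.clear()


buf = LogBuffer()
buf.record("update entry 1")
buf.record("delete entry 2")
log_before_commit = list(buf.log)  # snapshot taken before the commit
buf.commit()
```

Before the commit, the log in this sketch is still empty; afterward it holds both changes in order and the buffer is empty.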
Data Compression
In addition to log records, data compression is another feature that improves the efficiency of databases. Rather than simply storing data exactly as received from a user, data can be stored in a compressed format. Often, this compression is achieved by substituting shorter codes for longer data items that occur frequently in the database. As a simple example, each occurrence of the address "1000 Maple Street" may be represented in the database by "*". The stored database is therefore considerably shorter, since each occurrence of "1000 Maple Street" is reduced to "*". Translations between expanded data and compressed codes are stored in a "compression/decompression dictionary".
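The substitution described above can be illustrated with a minimal Python sketch. The function names compress and decompress are hypothetical, and the sketch assumes that a code such as "*" never appears in the uncompressed data; practical compression schemes are considerably more involved.

```python
from typing import Dict


def compress(text: str, dictionary: Dict[str, str]) -> str:
    """Replace each dictionary phrase with its shorter code."""
    for phrase, code in dictionary.items():
        text = text.replace(phrase, code)
    return text


def decompress(text: str, dictionary: Dict[str, str]) -> str:
    """Replace each code with the phrase it stands for."""
    for phrase, code in dictionary.items():
        text = text.replace(code, phrase)
    return text


# Compression/decompression dictionary from the example in the text.
dictionary = {"1000 Maple Street": "*"}

original = "Ship to 1000 Maple Street; bill to 1000 Maple Street"
stored = compress(original, dictionary)
restored = decompress(stored, dictionary)
```

Here the stored form is "Ship to *; bill to *", and decompressing it with the same dictionary restores the original text.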
Data compression is frequently applied to the log records as well as to the database. When both are compressed in the same way, the database and its log records are entirely compatible with each other, and both occupy less storage space by virtue of their compression.
A number of situations can arise, however, where log records of one format must be applied to data compressed with a different format. One situation occurs when reorganizing data by creating a separate reorganized copy of the data, called a "shadow" copy. If the shadow copy is created with a different encoding scheme, compatibility problems arise when applying log records of the original copy to update the shadow copy. A similar compatibility problem occurs in systems that maintain synchronized data replicas having different encoding schemes. Still another example is the use of log records having one format to recover data from an image copy that was originally made when a different compression format was in effect.
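The mismatch can be illustrated with a small Python sketch. The two dictionaries and the function names encode and decode are invented for the example. The sketch assumes a simple phrase-substitution encoding in which the original copy and the shadow copy happen to assign the same code to different phrases.

```python
from typing import Dict


def encode(text: str, dictionary: Dict[str, str]) -> str:
    """Replace each dictionary phrase with its code."""
    for phrase, code in dictionary.items():
        text = text.replace(phrase, code)
    return text


def decode(text: str, dictionary: Dict[str, str]) -> str:
    """Replace each code with the phrase it stands for."""
    for phrase, code in dictionary.items():
        text = text.replace(code, phrase)
    return text


# Hypothetical dictionaries: the shadow copy was built with a different
# encoding scheme, so "*" no longer means "1000 Maple Street".
original_dict = {"1000 Maple Street": "*"}
shadow_dict = {"1000 Maple Street": "#", "Springfield": "*"}

# A log record written against the original copy.
log_entry = encode("1000 Maple Street, Springfield", original_dict)

right = decode(log_entry, original_dict)  # read with the matching dictionary
wrong = decode(log_entry, shadow_dict)    # read with the shadow's dictionary
```

Read with the original dictionary, the log entry yields the intended address. Read with the shadow copy's dictionary, the same entry silently expands to "Springfield, Springfield", which shows why log records cannot be applied directly to data that uses a different encoding scheme.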
As shown above, a number of compatibility issues result from copying, reorganizing, or otherwise modifying an original database while continuing to log changes to the original database. Many systems address this problem by simply taking the data off-line, modifying it as desired, rebuilding the compression/decompression dictionary, and then rewriting the data to storage. This is an extremely time-consuming process, however, and many database management systems ("DBMSs") cannot tolerate such lengthy periods of data unavailability. For example, data must be constantly available in DBMSs that support automated teller machines (ATMs), catalog sales, stock brokerages, and businesses whose data users span many time zones across the world.
For these DBMSs, any database modification must be performed while the original data remains on-line and changes continue to be logged against the first database. These systems therefore still face difficulty when attempting to use the first database's log records to update the incompatible second database.