This invention relates generally to databases for bulk digital data storage and retrieval, and more particularly to data storage in mirrored data warehouse databases.
A data warehouse database is a repository of an enterprise's digitally stored data that provides an architecture for data flow to support operational systems such as online transaction processing (OLTP). Data warehouse databases are generally very large and experience very high volumes of bulk loads. To provide high database service availability, database mirroring software has been employed whereby a primary database maintains a second copy of the database in a mirror database, which is kept up to date so that it can take over processing in the event of failure of the primary database. The primary database has also been responsible for catching up or re-synchronizing the mirror database when the mirror has been temporarily down or network communications have temporarily been lost. An important measure of database service availability is the time it takes for a mirror database to take over processing once a failure of the primary database has been detected. This time is referred to as the mean-time-to-repair (MTTR). During takeover, no service is available because the primary database is down and the mirror database has not yet taken over service.
Conventional database software mirroring requires that a transaction log, which records transaction changes to the primary database, be shipped from the primary database to the mirror database, and that the mirror database perform ongoing redo of transactions by sequentially reading the transactions in the transaction log and reapplying the changes reflected in the log to its own copy of the database. This conventional approach, which is referred to as “log shipping”, requires that the mirror database finish its redo application (catch-up) before it can take over processing. For systems as large as data warehouses, such ongoing database redo is too slow and inefficient. During a transaction session, parallel work sessions may be running on the primary database, each performing input-output (I/O) on the database pages to be modified. In contrast, the mirror database must read the transaction log sequentially in order to apply changes to its database, and cannot perform physical I/O in parallel during this process. Thus, the mirror database may fall behind during high volumes of bulk loads, unacceptably increasing the takeover time and the time during which service is unavailable.
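By way of illustration only, the log shipping arrangement described above may be sketched as follows. This is a simplified in-memory model and not the implementation of any particular database product; the names `LogRecord`, `Primary`, `Mirror`, `redo`, and `can_take_over` are hypothetical and are introduced solely for this sketch.

```python
from dataclasses import dataclass, field


@dataclass
class LogRecord:
    lsn: int        # hypothetical log sequence number, increasing by one
    page_id: int    # database page the change applies to
    value: str      # new page content


@dataclass
class Primary:
    pages: dict = field(default_factory=dict)
    log: list = field(default_factory=list)

    def write(self, page_id, value):
        # Record the change in the transaction log, then apply it locally.
        record = LogRecord(lsn=len(self.log) + 1, page_id=page_id, value=value)
        self.log.append(record)
        self.pages[page_id] = value
        return record


@dataclass
class Mirror:
    pages: dict = field(default_factory=dict)
    applied_lsn: int = 0

    def redo(self, shipped_log, max_records=None):
        # Apply shipped records strictly in log order, one at a time.
        pending = [r for r in shipped_log if r.lsn > self.applied_lsn]
        if max_records is not None:
            pending = pending[:max_records]
        for record in pending:
            self.pages[record.page_id] = record.value
            self.applied_lsn = record.lsn
        return len(pending)

    def can_take_over(self, shipped_log):
        # Takeover is only safe once every shipped record has been redone.
        last_lsn = shipped_log[-1].lsn if shipped_log else 0
        return self.applied_lsn == last_lsn
```

In this sketch, a mirror that has redone only part of the shipped log reports that it cannot yet take over, which corresponds to the catch-up delay that lengthens the takeover time when the primary is generating changes faster than the mirror can apply them.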
A known performance enhancement for large database transaction loads is to bypass the transaction log, also known as the Write-Ahead-Log (WAL), and bulk load (write) the changes directly to database files. The advantage of bypassing the WAL is that the shared memory database page cache is not polluted with new pages, and the data is not written twice, i.e., once to the transaction log and a second time to the database file by a background writer. However, the WAL cannot be bypassed in a conventional mirrored database arrangement, because mirroring is performed using log shipping and the mirror database receives changes only through the log. Mirroring therefore reduces performance, since all bulk load changes must go through the WAL.
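The trade-off described above may be illustrated with a minimal sketch that merely counts writes. It assumes that the transaction log and the database file can be modeled as simple lists, and the function names `load_through_wal` and `load_bypassing_wal` are hypothetical; page caching and the background writer are not modeled beyond the second write they cause.

```python
def load_through_wal(rows, wal, data_file):
    """Bulk load in which every row is written twice."""
    writes = 0
    for row in rows:
        wal.append(row)        # first write: the transaction log
        writes += 1
    for row in rows:
        data_file.append(row)  # second write: background writer to the file
        writes += 1
    return writes


def load_bypassing_wal(rows, data_file):
    """Bulk load that writes each row directly to the database file."""
    writes = 0
    for row in rows:
        data_file.append(row)  # single write; nothing is logged
        writes += 1
    return writes
```

Under these assumptions, loading N rows through the WAL costs 2N writes, whereas bypassing the WAL costs N writes but leaves the log empty, so a mirror that relies on log shipping has nothing from which to redo the load.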
It is desirable to provide database mirroring solutions which address the foregoing and other problems of known approaches, by improving mirror database pair performance and reducing the mirror takeover processing time.