Given that main memory is becoming cheaper and larger, new data formats are needed to speed query processing when data is stored in memory. Existing formats are designed for disk and, when the data is held in memory (e.g., in the buffer cache), are not optimal for query processing. For example, it is common for database systems to store data persistently in “disk blocks”. Typically, within each disk block, data is arranged in row-major format. That is, the values of all columns of one row are followed by the values of all columns of the next row.
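As an illustration only, the row-major arrangement described above can be sketched as follows. The three-column table, the `pack_row_major` and `read_column` helpers, and the flat list standing in for a disk block are all hypothetical and are not taken from any particular database system.

```python
# Hypothetical three-column table: (id, name, salary).
NUM_COLUMNS = 3
rows = [
    (1, "alice", 50000),
    (2, "bob", 62000),
    (3, "carol", 58000),
]

def pack_row_major(rows):
    """Lay out values so that all columns of one row precede the next row."""
    return [value for row in rows for value in row]

def read_column(block, column_index, num_columns):
    """Read one column from a row-major block by stepping over the other columns."""
    return block[column_index::num_columns]

block = pack_row_major(rows)
salaries = read_column(block, 2, NUM_COLUMNS)
```

In this sketch, reading a single column touches every row in the block and skips the values of the other columns, which is one reason a row-major layout may be a poor fit for queries that scan only a few columns.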
To speed up performance, some of the disk blocks may be cached in a “buffer cache” within volatile memory. Accessing the data from volatile memory is significantly faster than accessing the data from disk. However, even within the volatile memory, the data is still in the format of row-major disk blocks, which is not optimal for certain types of database operations.
In contrast to row-major disk blocks, columnar formats have many attractive advantages for in-memory query processing, such as cache locality and compression. Consequently, some database servers now employ new table types that persistently store data in column-major format. Data stored in column-major format may be read into volatile memory, where it can be used to process certain queries more efficiently than would be possible if the data were stored in row-major disk blocks.
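The following minimal sketch illustrates why a column-major arrangement may help with scans and compression. The table contents and the `to_columns` and `run_length_encode` helpers are hypothetical, and run-length encoding is used here only as one example of a compression scheme that benefits from storing a column's values contiguously; it is not asserted to be the scheme used by any particular system.

```python
from itertools import groupby

# Hypothetical table: (id, region, amount).
rows = [
    (1, "west", 10),
    (2, "west", 20),
    (3, "west", 30),
    (4, "east", 40),
]

def to_columns(rows, names):
    """Regroup row tuples so that each column's values are stored together."""
    return {name: list(values) for name, values in zip(names, zip(*rows))}

def run_length_encode(values):
    """Collapse runs of equal adjacent values into (value, run_length) pairs."""
    return [(value, len(list(group))) for value, group in groupby(values)]

columns = to_columns(rows, ["id", "region", "amount"])

# An aggregate over one column reads only that column's contiguous values.
total_amount = sum(columns["amount"])

# Repeated adjacent values in a column compress into a few pairs.
encoded_region = run_length_encode(columns["region"])
```

Here the aggregate reads only the `amount` values, and the repeated `region` values collapse into two pairs, whereas in a row-major layout those same values would be interleaved with the values of the other columns.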
Unfortunately, migrating existing databases that persistently store data in row-major disk blocks to the new column-major table types is not trivial. Further, after such a migration, query processing will be less efficient for the class of queries that are performed more efficiently on data stored in row-major disk blocks.
As an alternative, some database systems keep the data in row-major disk blocks, but employ column store indexes. Column store indexes do not replace existing tables, and therefore do not require the entire database to be migrated to new table structures. Rather, column store indexes act more like traditional secondary indexes. For example, such column store indexes are still persisted to disk. Unfortunately, a significant amount of overhead may be required to maintain such indexes as updates are performed on the data indexed thereby.
As yet another alternative, one may replicate a database, where a first replica of the database stores the data in conventional row-major disk blocks, while a second replica stores the data in a column-major format. When a database is replicated in this manner, queries that are most efficiently processed using row-major data may be routed to the first replica, and queries that are most efficiently processed using column-major data may be routed to the second replica.
Unfortunately, this technique does not work well due to the lag that occurs between replicated systems. Specifically, at any given point in time, some changes made at one of the replicas will not yet have been applied to the other replica. Consequently, the lag inherent in the replication mechanism may result in unpredictable artifacts and, possibly, incorrect results.
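The effect of replication lag can be sketched with a toy model. The `Primary` and `Replica` classes, the in-memory change log, and the `apply_from` method below are hypothetical simplifications introduced for illustration; an actual replication mechanism would be considerably more involved.

```python
class Primary:
    """Toy row-major replica that records committed changes in a log."""

    def __init__(self):
        self.rows = {}
        self.log = []  # committed changes, in commit order

    def commit(self, key, value):
        self.rows[key] = value
        self.log.append((key, value))


class Replica:
    """Toy column-major replica that applies the primary's log at a later time."""

    def __init__(self):
        self.rows = {}
        self.applied = 0  # number of log entries already applied

    def apply_from(self, primary):
        for key, value in primary.log[self.applied:]:
            self.rows[key] = value
        self.applied = len(primary.log)


primary = Primary()
replica = Replica()

primary.commit("acct-1", 100)
replica.apply_from(primary)

primary.commit("acct-1", 250)        # committed at the primary only
stale_read = replica.rows["acct-1"]  # the replica has not yet applied the change

replica.apply_from(primary)
fresh_read = replica.rows["acct-1"]
```

Between the second commit and the next call to `apply_from`, a query routed to the replica in this sketch sees the older value, even though the newer value has already been committed at the primary.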
Further, each transaction generally needs to see its own changes, even before those changes have been committed. However, database changes are not typically replicated until the changes have been committed. Thus, a transaction may be limited to using the replica at which the transaction's uncommitted changes were made, even though the format of the data at the other replica may be more efficient for some operations.
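The constraint described above may be illustrated with the following sketch, in which a transaction keeps its uncommitted writes in a private overlay. The `Transaction` class, the `committed` dictionary standing in for the first replica, and the `replica` copy standing in for the second replica are hypothetical and assume, as stated above, that only committed changes are replicated.

```python
committed = {"acct-1": 100}  # committed state at the replica where the transaction runs
replica = dict(committed)    # the other replica receives only committed changes


class Transaction:
    """Toy transaction that buffers its writes until commit."""

    def __init__(self, store):
        self.store = store
        self.pending = {}  # uncommitted changes, visible only to this transaction

    def write(self, key, value):
        self.pending[key] = value

    def read(self, key):
        # A transaction sees its own uncommitted changes first.
        if key in self.pending:
            return self.pending[key]
        return self.store[key]

    def commit(self):
        self.store.update(self.pending)
        self.pending = {}


txn = Transaction(committed)
txn.write("acct-1", 175)

own_view = txn.read("acct-1")     # includes the uncommitted change
replica_view = replica["acct-1"]  # the other replica cannot see it
```

Because the uncommitted value exists only in the transaction's overlay in this sketch, any operation that must observe that value cannot be served from the other replica, regardless of which data format would be more efficient for the operation.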
The replicated data, such as data in column-major format, is periodically refreshed to stay synchronized with the primary data, such as data in row-major format. Refreshing replicated data may comprise rewriting entire columns' worth of data during a dedicated maintenance window to maintain the columnar representation. Not only is a significant amount of data updated during a refresh cycle, but refresh cycles must be repeated at least periodically because the replicated data may be subject to continuous change during online transaction processing (OLTP). Thus, each refresh cycle consumes considerable resources, and the column stores are unavailable for OLTP while the refresh is in progress.
The approaches described in this section are approaches that could be pursued, but not necessarily approaches that have been previously conceived or pursued. Therefore, unless otherwise indicated, it should not be assumed that any of the approaches described in this section qualify as prior art merely by virtue of their inclusion in this section.