At times, it is useful for a database management system (DBMS) to generate and maintain data that is derived from database data, such as metadata that describes particular portions of tables stored within a database, data that is stored in a different format than the database data, etc. For example, for each one megabyte (MB) of a given table in a database, the DBMS that manages the database derives metadata that indicates particular aspects of the data in that one MB, e.g., the max and min values for a given column in the table. This metadata is computed as data is loaded and updated, and is also computed as the DBMS scans the table in connection with responding to queries over the table. The DBMS stores this derived metadata, e.g., in main memory, as a data summary in a “derived cache” that is associated with the database data from which the data summary is derived.
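By way of illustration only, the following Python sketch shows one way that min and max values could be derived for each portion of a single column. The names `ChunkSummary` and `build_summaries` are hypothetical and are not drawn from any particular DBMS, and portions are delimited by a row count rather than by one MB of storage purely to keep the sketch short.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class ChunkSummary:
    """Hypothetical data summary for one portion of a table column."""
    first_row: int   # index of the first row covered by this summary
    last_row: int    # index of the last row covered (inclusive)
    min_value: int   # smallest column value within the portion
    max_value: int   # largest column value within the portion


def build_summaries(column: Sequence[int], rows_per_chunk: int) -> List[ChunkSummary]:
    """Derive one min/max summary per fixed-size portion of a column.

    Assumption: a portion is a fixed number of rows, standing in for the
    one-MB portions described in the text.
    """
    if rows_per_chunk <= 0:
        raise ValueError("rows_per_chunk must be positive")
    summaries = []
    for start in range(0, len(column), rows_per_chunk):
        chunk = column[start:start + rows_per_chunk]
        summaries.append(
            ChunkSummary(start, start + len(chunk) - 1, min(chunk), max(chunk))
        )
    return summaries
```

In this sketch the summaries are built in a single pass over the column; a DBMS could equally populate them incrementally as portions are loaded, updated, or scanned.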
The DBMS utilizes derived caches associated with a particular table to speed up processing of queries that run over the table. In the context of a data summary that includes min and max data for a particular column of a table, the DBMS uses the min and max data from the data summary to determine whether the portion of the table associated with the min and max data includes information that is required by the query.
For example, a particular query selects rows from a table T that includes a column A, where the value of column A is less than five. During execution of this query, the DBMS determines, from a data summary stored in a derived cache for a particular portion of table T, that the min value of column A within that portion of the table is 10. As such, none of the rows within the portion of table T associated with this derived cache are selected by the query, and the DBMS need not scan the rows in that portion of table T in order to execute the query. In this way, the DBMS uses a derived cache to prune input/output (I/O) operations from the query execution, specifically, I/O operations on the portion of table T that is associated with the derived cache.
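The pruning decision in the foregoing example can be sketched as follows. This is a minimal Python illustration under stated assumptions, not an actual DBMS implementation: the function names `portions_to_scan` and `select_less_than`, the in-memory dictionaries standing in for table portions, and the use of a `(min, max)` tuple as the data summary are all hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical data summary: (min, max) of column A for one portion.
Summary = Tuple[int, int]


def portions_to_scan(
    portion_ids: List[int], summaries: Dict[int, Summary], upper_bound: int
) -> List[int]:
    """Return the portions that may hold rows with A < upper_bound.

    A portion is pruned only when its summary proves that no row can
    match, i.e., when the portion's min value is >= upper_bound. A portion
    with no summary cannot be pruned and is therefore scanned.
    """
    keep = []
    for pid in portion_ids:
        summary = summaries.get(pid)
        if summary is None or summary[0] < upper_bound:
            keep.append(pid)
    return keep


def select_less_than(
    portions: Dict[int, List[int]], summaries: Dict[int, Summary], upper_bound: int
) -> Tuple[List[int], List[int]]:
    """Evaluate 'A < upper_bound', reading only the unpruned portions.

    Returns the matching values and the ids of the portions actually read.
    """
    scanned = portions_to_scan(sorted(portions), summaries, upper_bound)
    rows = [v for pid in scanned for v in portions[pid] if v < upper_bound]
    return rows, scanned
```

For instance, given a portion whose summary records a min value of 10 for column A, calling `select_less_than` with an upper bound of five leaves that portion out of the list of portions read, which corresponds to the pruned I/O operations described above.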
Since I/O operations are relatively costly operations, the ability to prune I/O operations from query execution increases the efficiency of executing queries that involve values summarized in derived caches. Likewise, other types of derived caches speed up execution of operations over database data and, as such, increase the efficiency of the DBMS.
Generally, derived cache data is built based on queries and other operations (such as data loads and updates) that have been run over a particular instance of data, i.e., the instance from which the derived cache is derived. As such, data that has been newly replicated or relocated does not have the benefit of derived cache data to increase the efficiency of operations over the data. It would be beneficial to make derived cache data that is derived from other instances of particular data available to the DBMS in connection with other, newer replicas of the particular data.
The approaches described in this section are approaches that could be pursued, but not necessarily approaches that have been previously conceived or pursued. Therefore, unless otherwise indicated, it should not be assumed that any of the approaches described in this section qualify as prior art merely by virtue of their inclusion in this section.