In a computer system, one of the most powerful mechanisms for making access to data more efficient is a cache. A cache is memory used to store copies of data items that are stored in a different memory, or in another portion of the same physical memory, for access by a computer process. The term memory refers to any resource or medium that may be used to store data, including volatile memory or non-volatile memory, such as a disk drive.
Caches come in many forms and are used in many types of systems. One of the simpler examples of a system that uses a cache is a computer that accesses disk blocks on a disk drive. The volatile memory of the computer is used as a cache for data stored on the disk. Data stored in the computer's volatile memory can be accessed more efficiently than data on a disk. In order to access a disk block on the disk, the disk block is loaded from the disk drive into a portion of volatile memory, where it is accessed multiple times by one or more processes more quickly and efficiently.
The data or data item of which a copy is stored in a cache is referred to as a source data item. The copy of the source data item stored in the cache is referred to herein as a cache copy. A memory in which the source data for a cache is stored is referred to as source memory. In the above example, the source data is the disk block and the source memory is the disk drive.
A cache can be accessed more efficiently than a source memory for a variety of reasons. For example, a cache could be in a memory composed of a faster medium than the medium of the source memory, as in the above example. A cache could be located in a networked computer's local memory, volatile or non-volatile, while the source memory is in another computer on the network. A cache for a computer can also be the memory of a second network-linked computer that can be accessed more quickly on the network than a third computer whose memory is the source memory. The network link between the computers may include a wide area network and other computers and/or network devices, each with a cache that holds data from the source memory.
In multi-processing systems, there may be many caches that hold copies for the same set of source data items. For example, in a multi-tiered architecture, in which a server at the first tier stores source data items, caches of numerous server clients at the second tier store cache copies of the source data items. There may be multiple cache copies of a single source data item in multiple caches of the second tier.
The process of managing a cache is referred to herein as cache management. Cache management includes retrieving copies of source data items and storing them in a cache, providing valid cache copies to clients that request copies of a source data item, and maintaining and optimizing the use of the cache. A cache management system includes software modules, which may be comprised of specialized software dedicated to managing one or more caches and may be executed by clients of a cache or servers of the source data, or a combination thereof. The software modules may be executed on multiple computer systems that participate in the cache management of multiple caches.
Clients of a cache rely on the accuracy of data provided to them from the cache, and often assume that data from the cache coheres to the source data, even as the source data undergoes changes and evolves through multiple states. A cache or cache copy coheres to its source data if the cache or cache copy is consistent with the source data according to some logic or set of rules. The condition of one or more caches being coherent with source data is referred to herein as cache coherency. One of the most important and challenging goals of cache management is achieving and managing cache coherency when the source data of a cache is constantly changing and evolving.
The most common approach to managing cache coherency is referred to herein as current coherency. Under current coherency, the rule or logic that governs whether a cache copy is consistent with a source data item is that a cache copy must be identical to the most recent version of the source data item. A cache is managed such that only cache copies that are identical with source data are treated as legitimate and coherent copies. When source data changes, the cache is changed to maintain cache coherency.
For example, suppose the source data item of a cache copy in a cache changes. In response to the change, a cache management system performs cache invalidation to prevent the cache copy of the old version of the source data item from being used as a legitimate copy. The term cache invalidation, or invalidation, is used herein to refer to the process of preventing or restricting cache copies from being treated as coherent copies. This is usually accomplished by removing or replacing cache copies, or by marking cache copies as incoherent or “dirty”, which prevents them from being provided to a cache client as a coherent copy of the source data. A coherent cache copy may be loaded into the cache as part of the process of cache invalidation, or in response to a cache miss, i.e. detecting that a cache does not hold a coherent copy of a requested data item.
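The invalidation behavior described above can be illustrated with a minimal sketch. It is not taken from any particular cache management system: the class `CurrentCoherencyCache`, the helper `update_source`, and the key `block7` are hypothetical names, a plain dictionary stands in for the source memory, and marking a copy as incoherent is chosen over removal only for illustration.

```python
class CurrentCoherencyCache:
    """Toy cache under current coherency: a copy is treated as coherent
    only if it has not been invalidated since it was loaded."""

    def __init__(self, source):
        self.source = source   # dict standing in for the source memory (assumption)
        self.copies = {}       # key -> cache copy
        self.invalid = set()   # keys whose cache copies are marked incoherent
        self.misses = 0        # number of cache misses observed

    def invalidate(self, key):
        # Mark the copy as incoherent; removing it would work equally well.
        if key in self.copies:
            self.invalid.add(key)

    def get(self, key):
        if key not in self.copies or key in self.invalid:
            # Cache miss: no coherent copy is held, so load one from the source.
            self.misses += 1
            self.copies[key] = self.source[key]
            self.invalid.discard(key)
        return self.copies[key]


def update_source(source, cache, key, value):
    """Change the source data item, then invalidate the now-stale cache copy."""
    source[key] = value
    cache.invalidate(key)


source = {"block7": "v1"}
cache = CurrentCoherencyCache(source)
first = cache.get("block7")    # miss: loads the copy of v1
second = cache.get("block7")   # hit: the copy is still coherent
update_source(source, cache, "block7", "v2")
third = cache.get("block7")    # miss: the old copy was invalidated
```

In this sketch the stale copy is never returned to a client after the source changes; the cost is one extra load from the source per invalidated item.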
Under a more general approach to cache coherency, a cache copy in a cache is associated with a coherency interval. A coherency interval is an interval for which a cache copy is a coherent copy of its respective source data. A coherency interval is usually associated with a particular version of a source data item.
For example, at time t1 a source data item S has a value v1, at time t2 a value v2, and at time t3 a value v3. A cache copy of S, S1, is associated with the interval bounded by t1 and t2. Another cache copy of S, S2, is associated with a coherency interval bounded by t2 and t3. Yet another cache copy of S, S3, is associated with an undetermined interval bounded by t3 and infinity, an end point of infinity representing that an end point of the coherency interval has not yet been fixed.
Cache clients are associated with a coherency point; the coherency of cache data for the client is based on the coherency point. For example, a client associated with a coherency point of time t23 requests data from S. Time t23 is between t2 and t3. The coherent cache copy of S is S2, the cache copy whose coherency interval is bounded by t2 and t3. The client's request may be satisfied by data in S2.
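The lookup of a coherent cache copy for a given coherency point can be sketched as follows. The sketch assumes that times are plain numbers (t1 = 1, t2 = 2, t3 = 3, t23 = 2.5) and that an interval includes its start and excludes its end; the text above does not specify which boundary is inclusive. The names `CacheCopy` and `coherent_copy` are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class CacheCopy:
    name: str
    value: str
    start: float            # start of the coherency interval
    end: float = math.inf   # infinity means the end point is not yet fixed


def coherent_copy(copies, coherency_point):
    """Return the copy whose interval contains the coherency point, if any."""
    for copy in copies:
        # Assumption: intervals are half-open, [start, end).
        if copy.start <= coherency_point < copy.end:
            return copy
    return None


copies_of_s = [
    CacheCopy("S1", "v1", start=1, end=2),
    CacheCopy("S2", "v2", start=2, end=3),
    CacheCopy("S3", "v3", start=3),   # undetermined end point
]

t23 = 2.5
chosen = coherent_copy(copies_of_s, t23)
```

With these values, `chosen` is the copy named S2, matching the example in the text.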
The current coherency approach is an instance of the more general coherency approach. Under current coherency, the coherency intervals associated with cache copies can be represented by a binary system, in which one binary state represents that a cache copy is coherent, and the other represents that the copy is not coherent.
The boundaries of coherency intervals are not necessarily defined by explicit times, but instead may be defined by events or the states of source data, or a combination thereof. For example, a database server typically applies changes to a database as transactions. The state of a database after applying a transaction (or a set of transactions) is referred to as a consistency state. A database transitions through consistency states as transactions are applied. The consistency states can define the boundaries of coherency intervals.
Cache invalidation under the more general approach to cache coherency involves restricting and/or establishing a boundary of coherency intervals. For example, at time t4, another version of source data item S, S4, is generated. In response, cache invalidation is performed by establishing for S3 a new coherency interval bounded by t3 and t4.
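Invalidation by fixing the end point of an open coherency interval can be sketched as follows. As an illustrative assumption, times are plain numbers (t3 = 3, t4 = 4) and each cache copy is a dictionary; the function name `invalidate_open_intervals` is hypothetical.

```python
import math


def invalidate_open_intervals(copies, new_version_time):
    """Fix the end point of every interval that is still open (ends at infinity)."""
    for copy in copies:
        if copy["end"] == math.inf:
            copy["end"] = new_version_time


copies_of_s = [
    {"name": "S1", "start": 1, "end": 2},
    {"name": "S2", "start": 2, "end": 3},
    {"name": "S3", "start": 3, "end": math.inf},
]

t4 = 4
invalidate_open_intervals(copies_of_s, t4)   # S3 is now bounded by t3 and t4
# The new version S4 starts its own, still open, interval at t4.
copies_of_s.append({"name": "S4", "start": t4, "end": math.inf})
```

Unlike invalidation under current coherency, S3 is not discarded here: it remains a coherent copy for any client whose coherency point falls between t3 and t4.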
Transaction Processing
Managing cache coherency in a multi-server environment is made more complex because of transaction processing. In transaction processing, changes to a database are applied as transactions in a way that preserves four properties. These properties are referred to as ACID properties, which are defined as follows.
ATOMICITY: A transaction should be done or undone completely and unambiguously.
CONSISTENCY: A transaction should preserve invariant properties (such as integrity constraints) defined on the data. On completion of a successful transaction, the data should evolve from one consistency state to another.
ISOLATION: Each transaction should appear to execute independently of other transactions that may be executing concurrently in the same environment. The effect of running a set of transactions concurrently should be the same as that of executing them serially. This requires that, during the course of a transaction, the intermediate (possibly inconsistent) state of the data not be exposed to any other transaction. Consequently, transactions must not be able to see the changes made by concurrently executing transactions until those transactions have been completed as an atomic unit and made persistent, i.e. committed.
DURABILITY: The effects of a completed transaction should always be persistent.
Under transaction processing, the data provided to a client of a database server should conform to ACID properties. To assure data is provided in this way, a snapshot approach is used. Under the snapshot approach, a client of a database server requests data from the database and makes changes to the database as a part of a transaction, herein referred to as the “active transaction”. Every version of a data item needed by the active transaction belongs to a “snapshot” of the database associated with the client. As other database transactions are committed, the database goes from one consistency state to another. A snapshot is a view of the database that is based on the particular consistency state (herein referred to as the “snapshot point”) that existed when the active transaction commenced, plus any modifications made by the active transaction. Thus, a snapshot includes all changes that were committed to the database as of the snapshot point and any modifications made by the active transaction itself, but no changes made by transactions that were not committed as of that consistency state. If no such version of a data item is actually stored anywhere, the version must be derived from an existing version of the data item.
Providing a snapshot requires tracking and generating a large amount of information. For example, a database server tracks which transactions are currently being executed, the consistency states with which they were associated when commenced, and which data blocks have rows changed by which transactions, and it generates records for redo and undo logs. Redo logs and undo logs contain the information needed to redo changes and undo changes, respectively.
To demonstrate how a snapshot is generated, the following example is provided. Assume that a data item DATA1 has been changed by three transactions TXA, TXB, and TXC, in that order. TXA committed before consistency state T, and TXC did not commit until consistency state T+1. Transaction TXB is associated with consistency state T, but has not committed. Because of the property of isolation, no other transaction should be able to see the changes made by TXB.
Transaction TXB also wishes to read DATA1. The version of DATA1 that TXB should see should reflect the change made by TXA but not TXC. The current version of DATA1 does not meet this requirement because it reflects changes made not only by TXA but also by TXC. However, the changes made by TXC may be removed from DATA1 to produce a “derived” version of DATA1, which may then be supplied to TXB. The derived version may be generated by applying undo records associated with TXC to the current version.
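The derivation of a version by applying undo records can be sketched as follows. The sketch rests on several assumptions that are not in the text: DATA1 is modeled as a block of three rows, each transaction changes a different row, consistency states are integers (T = 10), and an undo record is a tuple of transaction, row, and prior value. The function `derive_version` is hypothetical and does not reflect the undo format of any real database server.

```python
def derive_version(current, undo_log, commit_state, reader, snapshot_point):
    """Build the version of a data item that `reader` should see.

    Changes are undone, newest first, unless they were made by the reader
    itself or by a transaction committed as of the snapshot point.
    """
    derived = dict(current)
    for txn, row, old_value in reversed(undo_log):
        committed_at = commit_state.get(txn)   # None means not committed
        visible = txn == reader or (
            committed_at is not None and committed_at <= snapshot_point
        )
        if not visible:
            derived[row] = old_value   # apply the undo record
    return derived


T = 10
# Current version of DATA1 after the changes of TXA, TXB, and TXC.
data1 = {"row1": "a1", "row2": "b1", "row3": "c1"}
# Undo records in the order the changes were made: (transaction, row, prior value).
undo_log = [("TXA", "row1", "a0"), ("TXB", "row2", "b0"), ("TXC", "row3", "c0")]
# TXA committed before T, TXC at T+1, TXB has not committed.
commit_state = {"TXA": T - 1, "TXC": T + 1}

seen_by_txb = derive_version(data1, undo_log, commit_state, "TXB", T)
```

Here `seen_by_txb` keeps the change made by TXA and TXB's own change, while the change made by TXC is rolled back; the current version `data1` is left untouched.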
Assuring cache coherency and ACID compliance in a multi-server environment requires complex processing by and cooperation between database servers, use of very sophisticated protocols, software, and handshaking, as well as extensive network communication between the database servers.
Multi-Tier Database System
One of the problematic areas for cache management is management of caches in the middle tier of a multi-tier database system. A multi-tier database system has a database server in the first tier, one or more computers in the middle tier linked to the database server via a network, and one or more clients in the outer tier.
A client commences a transaction (“client transaction”) by issuing a query, via the middle tier, to a database server to request one or more result sets. In response to issuance of the query, the database server generates the result set based on a snapshot. The result set is then stored in a cache in the middle tier. The portion of the memory in the middle tier in which the result set is stored is referred to as the result set cache. When executing the client transaction, data is read from the result set cache and changes made by the client transaction are made to data in the result set cache and to the database. To commit the client transaction, the changes are committed to the database server.
As with any client in a transaction processing system, a client of a result set cache in the middle tier should be provided data from the cache that conforms to ACID properties. Thus, the logic on which the coherency of the result set cache depends is based on ACID properties. There are inconsistencies that arise between data items in the result set cache and the database server that cache invalidation should account for. To illustrate these inconsistencies and the reasons they arise, the following example is provided.
In the example, result sets requested by a client include a result set order set containing records representing an order and a result set order lines set containing records representing order line items of an order. Orders are represented by a table order. Order line items are represented by a table order lines. The products are represented by a table product. Product contains a column product_number representing the product number of an ordered product. Records in order lines set representing the order line items were produced by a query that joined order, order lines, and product. As a result of the join, the records in the result set contain a corresponding column product_number. The result set also contains a record OL1. The order table contains a column number_of_line_items representing the number of line items in the order. A record in order set contains a corresponding column.
During execution of a client transaction, there may be inconsistencies between the result set cache and the database server that arise for several reasons. The inconsistencies fall into one of several categories depending on what caused the inconsistency. The first category is referred to as “committed transaction inconsistencies”. This type of inconsistency is caused by transactions, other than the client transaction, that are committed by the database server after the snapshot point of the result set cache. For example, a source data item for the result set cache may have been changed by another transaction committed by the database server after the snapshot point of the result set. Thus, the cache copy of the source data item in the result set cache is incoherent. Referring to the current illustration involving order set and order lines set, assume that after order set is generated, another transaction changes the product_number column in the product table for the row corresponding to record OL1. The other transaction is committed after the result set is generated but before the client transaction is committed on the database server. When the client transaction is later committed, the value of product_number in record OL1 is not consistent with the corresponding column and row in the product table, and therefore the result set cache is not coherent.
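A committed transaction inconsistency can be reproduced with a small simulation. Everything in it is a hypothetical stand-in: the `Database` class, the product key `P1`, the product numbers 100 and 200, and the use of an integer commit counter as the consistency state. Only the record name OL1 and the column name product_number come from the illustration above.

```python
class Database:
    """Toy database server holding only the product table."""

    def __init__(self):
        self.consistency_state = 0   # advances on every commit (assumption)
        self.product = {"P1": {"product_number": 100}}

    def commit_change(self, product_id, product_number):
        # Another transaction changes product_number and commits.
        self.product[product_id]["product_number"] = product_number
        self.consistency_state += 1

    def query_order_lines(self):
        # The join copies product_number into the result set record OL1.
        row = {
            "record": "OL1",
            "product_id": "P1",
            "product_number": self.product["P1"]["product_number"],
        }
        return {"snapshot_point": self.consistency_state, "rows": [row]}


db = Database()
order_lines_set = db.query_order_lines()   # stored in the result set cache
db.commit_change("P1", 200)                # committed after the snapshot point

ol1 = order_lines_set["rows"][0]
is_stale = ol1["product_number"] != db.product[ol1["product_id"]]["product_number"]
```

After the other transaction commits, `is_stale` is true: the cached record OL1 still carries the old product number, and nothing in the cache itself signals that the database has moved past the result set's snapshot point.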
The second category of inconsistencies is referred to as “active transaction inconsistencies”. These are inconsistencies caused by uncommitted changes made as part of the client transaction. In general, these are changes that should be triggered by the uncommitted change but are not. This may occur, for example, when there is “business” logic on the database server that the database server is configured to execute but the client is not. Referring to the current illustration, a client transaction inserts a row into order lines. The client adds the row to the result set cache and invokes an API (“Application Program Interface”) provided by the database server for inserting the row. In response, the database server inserts the row, which causes a trigger to invoke a stored procedure. The stored procedure increments the number_of_line_items column in the corresponding row in order. While the database server is configured to execute this stored procedure, the client is not configured to execute this procedure or similar logic to update number_of_line_items in order set when inserting a record into order lines set. The result set cache, and in particular the value of number_of_line_items as stored in the result set cache, is incoherent even before the client transaction is committed.
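An active transaction inconsistency can be reproduced in the same way. The `Server` class, its method names, and the record name OL2 are hypothetical; the trigger and stored procedure are modeled as an ordinary method call, which is a simplification of how a real database server would run them.

```python
class Server:
    """Toy database server with an insert trigger on order lines."""

    def __init__(self):
        self.order = {"order_id": 1, "number_of_line_items": 1}
        self.order_lines = [{"record": "OL1", "order_id": 1}]

    def insert_order_line(self, row):
        # API invoked by the client to insert a row into order lines.
        self.order_lines.append(dict(row))
        self._after_insert_trigger(row)

    def _after_insert_trigger(self, row):
        # Stands in for the stored procedure fired by the trigger.
        self.order["number_of_line_items"] += 1

    def query_result_sets(self):
        # Returns copies, as a result set cache would hold.
        return dict(self.order), [dict(r) for r in self.order_lines]


server = Server()
order_set, order_lines_set = server.query_result_sets()

# The client transaction inserts a line item in the cache and on the server.
new_row = {"record": "OL2", "order_id": 1}
order_lines_set.append(new_row)
server.insert_order_line(new_row)

cached_count = order_set["number_of_line_items"]
server_count = server.order["number_of_line_items"]
```

In this sketch `cached_count` remains 1 while `server_count` is 2, even though no transaction has been committed: the client mirrored its own insert but not the side effect of the server-side logic.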
To maintain cache coherency, both during the execution of a client transaction and after committing a transaction, there is a need to invalidate the incoherent data within a result set cache. Unfortunately, there exists no cache invalidation mechanism that invalidates and/or replaces only incoherent data in the result set caches of a middle tier database system. Development of such a cache invalidation mechanism has been stymied by the difficulty of tracking or detecting when a cache copy in the result set cache becomes incoherent as a result of changes made by a client transaction, both during and after commitment of the transaction. Thus, the conventional approach to making the result set cache coherent is simply to require the client to request regeneration of another result set, which is regenerated by the database server and communicated back to the middle tier, where it is stored in place of the older version of the result set.
Based on the foregoing, there is clearly a need for a mechanism that tracks and identifies what changes need to be made to maintain the coherency of result set caches, and to do so efficiently.
The approaches described in this section are approaches that could be pursued, but not necessarily approaches that have been previously conceived or pursued. Therefore, unless otherwise indicated, it should not be assumed that any of the approaches described in this section qualify as prior art merely by virtue of their inclusion in this section.