1. Field of the Invention
The present invention relates generally to information processing environments and, more particularly, to a cache management system providing improved page latching methodology.
2. Description of the Background Art
Computers are very powerful tools for storing and providing access to vast amounts of information. Computer databases are a common mechanism for storing information on computer systems while providing easy access to users. A typical database is an organized collection of related information stored as “records” having “fields” of information. As an example, a database of employees may have a record for each employee where each record contains fields designating specifics about the employee, such as name, home address, salary, and the like.
Between the actual physical database itself (i.e., the data actually stored on a storage device) and the users of the system, a database management system or DBMS is typically provided as a software cushion or layer. In essence, the DBMS shields the database user from knowing or even caring about the underlying hardware-level details. Typically, all requests from users for access to the data are processed by the DBMS. For example, information may be added to or removed from data files, retrieved from or updated in such files, and so forth, all without user knowledge of the underlying system implementation. In this manner, the DBMS provides users with a conceptual view of the database that is removed from the hardware level. The general construction and operation of database management systems are well known in the art. See, e.g., Date, C., “An Introduction to Database Systems, Seventh Edition”, Addison Wesley, 2000.
Caching data is an important performance optimization that is employed in database systems as well as a large number of other systems and applications. The premise behind caching data is that most programs (e.g., application programs) typically access data that is primarily localized within a few files or pages. Bringing that data into memory and keeping it there for the duration of the application's accesses minimizes the number of disk reads and writes the system must perform. Without caching, applications require relatively expensive disk operations every time they require access to the data.
In the context of a database management system, a cache is typically employed to hold database files in memory. FIG. 1 is a high-level block diagram illustrating data structures of a cache. Database files are typically organized as an array of fixed-size units referred to as “pages”. A cache entry typically holds a single page image. These images are usually stored in an array 110 to satisfy alignment requirements. Typically the cache is not large enough to hold the entire database, and as a result pages must be brought into (and out of) the cache in response to requests for access to particular data. In addition, the requests for access to particular data include both requests to read data and requests to write data (e.g., create or update data records). Accordingly, management of the cache involves not only managing the process of bringing items into and out of the cache, but also coordinating read and write access to the data. For example, if a writer is updating a page, one would like to ensure that no readers have access to the page until the update is finished. Thus, in conjunction with management of the cache, state information 120 is also maintained for each entry (e.g., an ordinal indicating what page it contains, a flag indicating whether or not it has been changed, etc.). The data structure holding the state information 120 for a cache entry (referred to herein as “page infos” or “infos”) may be inlined with the page images or may be maintained in a separate data structure as illustrated at FIG. 1.
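By way of illustration only, the arrangement of FIG. 1 in which page images and page infos are kept in separate structures might be sketched as follows. The page size, the class names `PageInfo` and `Cache`, and the field names are assumptions made for this sketch and do not appear in the description above.

```python
from dataclasses import dataclass
from typing import Optional

PAGE_SIZE = 4096  # assumed page size; the text does not specify one


@dataclass
class PageInfo:
    """State kept for one cache entry (a "page info")."""
    page_no: Optional[int] = None  # ordinal of the page held; None if the slot is empty
    dirty: bool = False            # True once the page image has been changed


class Cache:
    def __init__(self, n_entries: int) -> None:
        # One contiguous buffer holding all page images (cf. array 110).
        self.images = bytearray(n_entries * PAGE_SIZE)
        # Per-entry state kept in a separate structure (cf. state information 120).
        self.infos = [PageInfo() for _ in range(n_entries)]

    def image(self, slot: int) -> memoryview:
        """Return a writable view of the page image stored in the given slot."""
        start = slot * PAGE_SIZE
        return memoryview(self.images)[start:start + PAGE_SIZE]


# Usage: place page 17 in slot 2, modify it, and mark the entry as changed.
cache = Cache(n_entries=4)
cache.infos[2].page_no = 17
cache.image(2)[0:5] = b"hello"
cache.infos[2].dirty = True
```

Keeping the images in one contiguous buffer is one simple way to give every page the same offset arithmetic; inlining the info with each image, as the text also permits, would be an equally valid layout.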
Cache entries are typically indexed in order to facilitate access. For example, an array 130 indexed by a hash of the page name could be used to enable efficient lookup by name. As illustrated at FIG. 1, to find the cache entry for a particular page by name, one would hash the page name and use the result as an index into the array 130 to obtain a pointer to a cache entry. More than one page in the cache may have a name that hashes to a given value, and therefore the pointer in the indexed array 130 is a pointer to a chain or list 140 (referred to herein as “cache chains”) of cache entries. Even so, searching a single cache chain 140 is much faster than searching all of the cache entries one by one in order to locate a particular page of interest.
The process of searching for a particular page in the above-described cache typically proceeds as follows. A search for a page having a particular name would start with hashing the name and using the result as an index into an array 130 of cache chains. The search would then follow the pointer(s) through the entries (if any) of the designated cache chain. If a cache entry for the page is found in the cache chain, this is known as a “cache hit”. However, if the page is not found in the cache, this is known as a “cache miss”. In the event of a cache miss, the page is brought into the cache (e.g., from disk) and its cache entry is linked into the appropriate cache chain. Adding a page to the cache can involve evicting a page from the cache to free up a cache entry for the new page.
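The search procedure described above may be sketched, for a single-threaded setting, roughly as follows. The names `Entry`, `ChainedCache`, and `read_page` are hypothetical, and first-in-first-out eviction is an assumption chosen only to keep the sketch short; the description does not specify an eviction policy.

```python
from typing import Callable, List, Optional


class Entry:
    def __init__(self, name: str, image: bytes) -> None:
        self.name = name
        self.image = image
        self.next: Optional["Entry"] = None  # next entry on the same cache chain


class ChainedCache:
    def __init__(self, n_chains: int, capacity: int,
                 read_page: Callable[[str], bytes]) -> None:
        self.chains: List[Optional[Entry]] = [None] * n_chains  # cf. array 130
        self.capacity = capacity
        self.read_page = read_page    # hypothetical stand-in for a disk read
        self.order: List[Entry] = []  # arrival order, used for FIFO eviction
        self.hits = 0
        self.misses = 0

    def _index(self, name: str) -> int:
        return hash(name) % len(self.chains)

    def _unlink(self, victim: Entry) -> None:
        """Remove an evicted entry from its cache chain."""
        i = self._index(victim.name)
        cur = self.chains[i]
        if cur is victim:
            self.chains[i] = victim.next
            return
        while cur is not None and cur.next is not victim:
            cur = cur.next
        if cur is not None:
            cur.next = victim.next

    def get(self, name: str) -> bytes:
        i = self._index(name)
        cur = self.chains[i]
        while cur is not None:        # follow the chain's pointers
            if cur.name == name:
                self.hits += 1        # cache hit
                return cur.image
            cur = cur.next
        self.misses += 1              # cache miss
        if len(self.order) >= self.capacity:
            self._unlink(self.order.pop(0))  # evict the oldest entry
        entry = Entry(name, self.read_page(name))
        entry.next = self.chains[i]   # link the new entry into its chain
        self.chains[i] = entry
        self.order.append(entry)
        return entry.image


# Usage: a two-entry cache whose "disk" simply encodes the page name.
cache = ChainedCache(n_chains=4, capacity=2, read_page=lambda n: n.encode())
first = cache.get("a")   # miss: page brought into the cache
second = cache.get("a")  # hit: found on its cache chain
```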
The above approach is widely used and works well in single-threaded environments. However, in a multi-threaded environment the above-described cache structure may be problematic. For example, a cache chain (i.e., linked list) accessed by a particular thread may be in an inconsistent state because another thread is currently in the process of updating the linked list. As another example, two threads may attempt to update the same page at the same time, which may result in lost updates. In order to address these kinds of problems in a multi-threaded environment, the traditional solution is to have a mutex (or mutual exclusion object) associated with each of the cache chains, in order to manage concurrent access to such cache chains as well as updates to the page state information found in the infos (the state for the latches used to control access to the page contents is typically found here). A mutex is a program object that allows multiple program threads to share the same resource, such as file access, but not simultaneously. As illustrated at FIG. 1, an array element 135 at the head of a particular cache chain includes a mutex for managing access to its cache chain.
Updating a particular page in a traditional cache using chain mutexes as described above generally proceeds as follows. The thread performing the update locates the cache chain containing the page and then acquires the chain's mutex. After finding the page's cache entry on the chain, the thread acquires exclusive access to the page (effectively by manipulating some of the state in the info associated with the cache entry, while under the guard of the mutex). Obtaining exclusive access might involve waiting for other thread(s) to finish with the page. If this is the case the chain mutex will be given up temporarily, and then re-acquired. Once exclusive access is acquired, the mutex is released. After making the desired update to the page, the thread then relinquishes the exclusive access granted to it (typically requiring the acquisition of a mutex, manipulating some state, and then the release of that mutex). The above is a general example of the operations that may be required to update a page. Those skilled in the art will appreciate that variations are possible. Significantly, a similar process is also involved when a thread is simply reading a page that is in cache.
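The update sequence just described may be sketched as follows. This is a minimal sketch under stated assumptions, not a definitive implementation: the names `PageInfo`, `Chain`, and `update_page` are hypothetical, a dictionary stands in for the linked cache chain, and a condition variable is assumed as the mechanism by which the chain mutex is given up while waiting and then re-acquired.

```python
import threading
from typing import Callable, Dict


class PageInfo:
    def __init__(self) -> None:
        self.readers = 0     # number of threads holding shared access
        self.writer = False  # True while one thread holds exclusive access
        self.data = 0        # stands in for the page contents


class Chain:
    def __init__(self) -> None:
        # The lock inside the condition variable plays the role of the chain mutex.
        self.cond = threading.Condition()
        # Dictionary used in place of the linked list of cache entries.
        self.pages: Dict[str, PageInfo] = {}


def update_page(chain: Chain, name: str, fn: Callable[[int], int]) -> None:
    with chain.cond:                  # acquire the chain mutex
        info = chain.pages[name]      # find the page's cache entry on the chain
        while info.writer or info.readers:
            chain.cond.wait()         # mutex given up while waiting, then re-acquired
        info.writer = True            # exclusive access obtained
    # The chain mutex is released here; the page itself is protected by the latch state.
    info.data = fn(info.data)
    with chain.cond:                  # acquire the mutex again to give up the latch
        info.writer = False
        chain.cond.notify_all()       # wake any threads waiting for the page


# Usage: one update applied to a page named "root".
chain = Chain()
chain.pages["root"] = PageInfo()
update_page(chain, "root", lambda v: v + 1)
```

Note that the mutex is held only while the latch state in the info is examined or changed, not for the duration of the page update itself.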
A disadvantage of the above approach in a multi-threaded environment is that the process of managing access to pages in the cache adversely affects the overall performance of the system maintaining the cache (e.g., a database management system). Typically, the most common cache operation performed is obtaining read access to a page that is already in cache. For example, in a multi-threaded, multi-processor database system environment, a page that is often of interest is the root page of an index. It is quite likely that multiple threads will frequently try to read this particular page. However, as only one thread at a time may hold a mutex, “convoys” can result when threads queue to wait for the chain mutex associated with this page (which must be acquired and released during the process of finding the cache entry for the page and obtaining shared access to it). For example, five threads may be attempting to obtain shared access to the index root page, but the serial nature of obtaining the mutex means that only one can do so at a time, thereby slowing down overall system performance.
Another problem is the number of mutex operations that are required. For example, in the above-described environment, four mutex operations are typically required in order to read a page. A thread must first acquire and release a mutex in order to obtain read access to the page. After reading the page, the thread must again acquire and release a mutex in order to relinquish the granted read access. Acquiring and releasing a mutex are typically implemented using a “compare and swap” or some other synchronizing instruction. These instructions are typically much more expensive, in terms of system performance, than one might expect for the fairly simple operations they perform. Because these operations are expensive, it would be preferable to use as few of them as possible, thereby enabling overall system performance to be improved.
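The count of four mutex operations per page read can be made concrete with a small sketch. The wrapper class `CountingMutex` and the function `read_page` are hypothetical names introduced for illustration, and the sketch assumes no writer is active so that shared access never has to wait.

```python
import threading
from typing import Dict


class CountingMutex:
    """A mutex that counts how many acquire and release operations it has seen."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self.operations = 0

    def __enter__(self) -> "CountingMutex":
        self._lock.acquire()
        self.operations += 1  # count the acquire (done while holding the lock)
        return self

    def __exit__(self, *exc) -> bool:
        self.operations += 1  # count the release (still holding the lock)
        self._lock.release()
        return False


def read_page(mutex: CountingMutex, info: Dict[str, int]) -> int:
    with mutex:               # operations 1 and 2: obtain read access
        info["readers"] += 1
    value = info["data"]      # read the page with the mutex released
    with mutex:               # operations 3 and 4: relinquish read access
        info["readers"] -= 1
    return value


# Usage: a single read of a page costs four mutex operations.
mutex = CountingMutex()
info = {"readers": 0, "data": 42}
value = read_page(mutex, info)
```

Each of the four counted operations corresponds to at least one synchronizing instruction on a typical implementation, which is the cost the paragraph above refers to.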
What is needed is an improved cache management solution that reduces the number of expensive, serializing operations that are performed and thereby provides improved system performance. In particular, a solution is needed that provides improved performance in a multi-threaded, multi-processor environment. The present invention provides a solution for these and other needs.