This disclosure relates generally to computer-based mechanisms for data cache management, and more particularly to techniques to improve read or write access for global data cache structures, and manage expiring data pages and data cache free space.
Various database systems use an in-memory cache to speed up access to data. There are various well-known cache management and data replacement techniques, such as Least Recently Used (LRU), a rule by which a page is selected to be removed if it has been used less recently than any other page. Most, if not all, caching implementations for multi-processor systems need to synchronize on some central lock in order to access a page or resource. This causes significant stalls in data processing on multi-processor systems, where synchronization on global locks is very costly (on the order of hundreds to thousands of CPU cycles).
FIG. 1 illustrates an exemplary data structure 100 for a data processing system. For each central processing unit (CPU) in the data processing system, there is a dedicated data structure for CPU state 102. Each CPU state 102 is associated with several queues 104, which hold page usage information, free “buckets” of memory for memory allocations, and currently-held pages for reading or writing.
Tasks (T1, T2, . . . ) 108 on each CPU access these CPU state 102 data structures to reserve access to data cache 101. Since tasks 108 are assigned to a CPU, any operations on the CPU state 102 and associated queues 106 do not need to be synchronized between tasks on that CPU. No task 108 accesses the CPU state 102 of another CPU.
In addition to CPU-specific state 102, a number of global structures represent the data cache 101 itself, and include the actual data cache buckets containing database pages, the LRU queue, data cache control blocks and a hash table, global locks, etc. All tasks 108 need to access these global structures. There are at least two global locks. An LRU global lock 112 protects the LRU queue, while a free list global lock 114 protects the list of free cache buckets.
In operation, an LRU process needs two global variables with pointers to the LRU head and the LRU tail of the LRU queue, to be able to quickly access both ends of the LRU queue. The LRU head is used to chain in recently used buckets; the LRU tail is used to quickly find buckets that have been unused for a predetermined time and which are candidates for expiration. As the data cache 101 becomes full, an expirer process 110 expires old buckets and frees up the memory for new pages which are loaded from another source, e.g., from secondary storage.
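By way of illustration only, an LRU queue with head and tail pointers can be sketched as a doubly linked list. The class and method names below (Bucket, LRUQueue, push_head, unchain, touch) are hypothetical and do not appear in FIG. 1; the sketch omits all locking and assumes that a bucket passed to unchain or touch is currently chained in the queue.

```python
from typing import Optional


class Bucket:
    """One cache bucket; the field names are illustrative assumptions."""

    def __init__(self, page_id: int) -> None:
        self.page_id = page_id
        self.prev: Optional["Bucket"] = None  # neighbor toward the LRU head
        self.next: Optional["Bucket"] = None  # neighbor toward the LRU tail


class LRUQueue:
    def __init__(self) -> None:
        self.head: Optional[Bucket] = None  # most recently used end
        self.tail: Optional[Bucket] = None  # least recently used end

    def push_head(self, b: Bucket) -> None:
        """Chain a recently used bucket in at the head."""
        b.prev = None
        b.next = self.head
        if self.head is not None:
            self.head.prev = b
        self.head = b
        if self.tail is None:
            self.tail = b

    def unchain(self, b: Bucket) -> None:
        """Remove a chained bucket from the queue in constant time."""
        if b.prev is not None:
            b.prev.next = b.next
        else:
            self.head = b.next
        if b.next is not None:
            b.next.prev = b.prev
        else:
            self.tail = b.prev
        b.prev = b.next = None

    def touch(self, b: Bucket) -> None:
        """Move a chained bucket to the head after it has been used."""
        self.unchain(b)
        self.push_head(b)
```

With both pointers kept up to date, chaining in at the head and inspecting expiration candidates at the tail each take constant time, which is why both ends are tracked.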
Read Access
A given page p must be present in the cache before read access to it can be provided. If it is not present, it has to be loaded from secondary storage. Furthermore, it must be ensured that page p is not expired from the cache for as long as it is being accessed. The conventional method of providing read access includes locking the LRU queue to prevent expirations, and finding page p in the cache. If page p is not found, the LRU queue is unlocked to allow parallel processes to continue, page p is loaded from secondary storage, the LRU queue is locked again, and page p is placed at the head. After this, page p can be marked as used for reading. The LRU queue can then be unlocked and the data access performed, after which the LRU queue is locked again, page p is marked as unused and moved to the head of the LRU queue, and the LRU queue is unlocked again.
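The conventional read path described above can be sketched as follows. This is a minimal illustration under stated assumptions, not an implementation taken from any particular database system: the class ConventionalCache, the load_page callback standing in for secondary storage, and the lock_acquisitions counter are all hypothetical, and an ordered dictionary stands in for the LRU queue, with its last item treated as the LRU head.

```python
import threading
from collections import OrderedDict


class ConventionalCache:
    """Illustrative only: every read goes through one global LRU lock."""

    def __init__(self, load_page):
        self.lru_lock = threading.Lock()  # the global LRU lock
        self.lru = OrderedDict()          # page_id -> [data, reader count]
        self.load_page = load_page        # hypothetical secondary-storage loader
        self.lock_acquisitions = 0        # bookkeeping to make the cost visible

    def read(self, page_id, access):
        # Lock the LRU queue, look the page up, and mark it used if found.
        with self.lru_lock:
            self.lock_acquisitions += 1
            entry = self.lru.get(page_id)
            if entry is not None:
                entry[1] += 1
        if entry is None:
            # Miss: load with the queue unlocked, then lock it again.
            data = self.load_page(page_id)
            with self.lru_lock:
                self.lock_acquisitions += 1
                # Another task may have loaded the page in the meantime.
                entry = self.lru.setdefault(page_id, [data, 0])
                self.lru.move_to_end(page_id)  # place at the LRU head
                entry[1] += 1                  # mark as used for reading
        try:
            return access(entry[0])            # data access, queue unlocked
        finally:
            # Lock once more to mark the page unused and move it to the head.
            with self.lru_lock:
                self.lock_acquisitions += 1
                entry[1] -= 1
                self.lru.move_to_end(page_id)
```

In this sketch a cache hit takes the global lock twice and a miss takes it three times, so every reader on every CPU serializes on the same lock even when the pages being read are unrelated.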
The expirer process 110 typically ignores pages that are in use when expiring pages from the LRU tail of the LRU queue. In general, this requires locking the global LRU lock and the free list lock, which is a major contention point. There are various “tricks” to limit the impact of this global lock, such as using several LRU queues hashed by page ID, but it is not possible to remove the contention using conventional algorithms. Such solutions also usually worsen cache efficiency. For instance, where there are several independent LRU queues, there is significant I/O overhead in the worst case. Further, to mark a page as used for reading, conventional algorithms usually also take a shared lock on the page, which again adds contention.
Write Access
In conventional methods, a page is locked exclusively for write and shared for read, i.e., all readers are excluded when a writer tries to write the page and the writer is excluded when any reader runs. Access to the page otherwise proceeds similarly to read access.
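A shared/exclusive page lock of the kind described above can be sketched with a condition variable. The PageLock class and its method names are assumptions made for illustration; the sketch gives no preference to waiting writers, so a steady stream of readers could delay a writer indefinitely, and a production lock would need a fairness policy.

```python
import threading


class PageLock:
    """Illustrative shared (read) / exclusive (write) lock for one page."""

    def __init__(self):
        self._cond = threading.Condition()
        self._readers = 0      # number of tasks holding the lock shared
        self._writer = False   # True while one task holds it exclusively

    def acquire_shared(self, timeout=None):
        """Wait until no writer holds the page; return False on timeout."""
        with self._cond:
            if not self._cond.wait_for(lambda: not self._writer, timeout):
                return False
            self._readers += 1
            return True

    def release_shared(self):
        with self._cond:
            self._readers -= 1
            if self._readers == 0:
                self._cond.notify_all()  # a waiting writer may proceed

    def acquire_exclusive(self, timeout=None):
        """Wait until there is no reader and no writer; False on timeout."""
        with self._cond:
            free = lambda: not self._writer and self._readers == 0
            if not self._cond.wait_for(free, timeout):
                return False
            self._writer = True
            return True

    def release_exclusive(self):
        with self._cond:
            self._writer = False
            self._cond.notify_all()      # waiting readers or writers proceed
```

Even the shared path modifies state that all readers of the page contend for, which is the additional contention noted for read access.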
Expiring Data Pages
When the data cache is full, it is necessary to remove, or expire, some of the least recently used pages from the cache, based on the LRU policy. For this purpose, the expirer process 110 is triggered. The expirer process 110 itself is usually a singleton, i.e., there may not be several expirer processes running in parallel. The expirer process 110 is adapted to remove some data cache buckets and make them free for reuse.
In a conventional method for expiring data pages, the expirer process 110 locks the LRU queue (global lock), picks one or more unused pages from the LRU tail, unchains the selected pages from the LRU queue, unlocks the LRU queue, and writes any modified unchained pages to secondary storage. Then, the free list is locked, the unchained pages are added to the free list, and the free list is unlocked.
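The expiration steps listed above can be sketched as a single function. The Page class, the expire function, and the write_page callback standing in for secondary storage are hypothetical names introduced for illustration; an ordered dictionary again stands in for the LRU queue, with its first item treated as the LRU tail. The sketch does not guard against a page being modified or reused while it is being written.

```python
import threading
from collections import OrderedDict


class Page:
    """Illustrative page record; field names are assumptions."""

    def __init__(self, page_id, dirty=False):
        self.page_id = page_id
        self.dirty = dirty   # modified since it was loaded
        self.in_use = 0      # number of current readers or writers


def expire(lru, lru_lock, free_list, free_lock, write_page, count):
    """Expire up to `count` unused pages from the LRU tail."""
    victims = []
    with lru_lock:                        # global LRU lock
        for page_id, page in list(lru.items()):
            if len(victims) == count:
                break
            if page.in_use:
                continue                  # pages in use are ignored
            del lru[page_id]              # unchain from the LRU queue
            victims.append(page)
    for page in victims:                  # I/O happens with the queue unlocked
        if page.dirty:
            write_page(page)
            page.dirty = False
    with free_lock:                       # global free list lock
        free_list.extend(victims)
    return len(victims)
```

The two `with` blocks correspond to the two global locks, each held only briefly, with the comparatively slow write to secondary storage performed between them.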
As can be understood, contention exists for the LRU queue lock, and further contention exists for the free list lock. These two contention points are not too problematic, however, as the expirer process runs seldom in comparison to cache access. But this is not a complete solution, as care must be taken to prevent writes to pages that are undergoing I/O and to re-chain used pages back into the LRU queue if those pages were used while being written.
Free Space Management
Unused pages are usually held in one of a number of types of free lists. For example, on startup, all pages are free. Or, when a page gets deleted or expired from cache, it becomes free. Free pages are then used when loading a page from secondary storage, or when creating a completely new page. In conventional methods, free list access needs to be synchronized using a global lock, which also causes contention.
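For illustration, a free list synchronized by a single global lock might look like the following. The FreeList class and the use of integer bucket indices are assumptions of this sketch, not details from the figure.

```python
import threading


class FreeList:
    """Illustrative free list of bucket indices behind one global lock."""

    def __init__(self, bucket_count):
        self._lock = threading.Lock()           # global free list lock
        self._free = list(range(bucket_count))  # on startup, all buckets are free

    def allocate(self):
        """Take a free bucket, or return None if the cache is full."""
        with self._lock:
            return self._free.pop() if self._free else None

    def release(self, bucket):
        """Return a deleted or expired bucket to the free list."""
        with self._lock:
            self._free.append(bucket)
```

Every page load and every page creation on any CPU passes through `allocate`, so all of them serialize on the same lock.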
Therefore, an improved data cache management scheme is needed.