A typical data storage system includes a cache device that stores data so that future requests for that data can be served faster. The data that is stored within a cache might be values that have been computed earlier or duplicates of original values that are stored elsewhere. If the requested data is contained in the cache (herein referred to as a cache hit), this request can be served by simply reading the cache, which is comparatively faster. On the other hand, if the requested data is not contained in the cache (herein referred to as a cache miss), the data has to be recomputed or fetched from its original storage location, which is comparatively slower. Hence, the greater the number of requests that can be served from the cache, the faster the overall system performance becomes.
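The distinction between a cache hit and a cache miss can be sketched as follows. This is a minimal illustration only: the class name `SimpleCache`, the function `slow_fetch`, and the hit/miss counters are hypothetical and do not appear in the description above.

```python
from typing import Callable, Dict, List


class SimpleCache:
    """Hypothetical read-through cache that counts hits and misses."""

    def __init__(self, fetch: Callable[[str], str]) -> None:
        self._fetch = fetch                  # slow path: recompute or read original storage
        self._entries: Dict[str, str] = {}   # cached copies of the data
        self.hits = 0
        self.misses = 0

    def get(self, key: str) -> str:
        if key in self._entries:             # cache hit: serve directly from the cache
            self.hits += 1
            return self._entries[key]
        self.misses += 1                     # cache miss: fall back to the slow path
        value = self._fetch(key)
        self._entries[key] = value
        return value


def slow_fetch(key: str) -> str:
    # Stand-in for the original storage location (an assumption for illustration).
    return key.upper()


cache = SimpleCache(slow_fetch)
results: List[str] = [cache.get(k) for k in ["a", "b", "a", "a"]]
```

In this sketch the first requests for "a" and "b" are misses, and the two repeated requests for "a" are served from the cache.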
During a cache miss, the storage system may evict a cache entry (also commonly referred to as a cache slot) in order to make room for the new requested data. As used herein, evicting a cache entry refers to the reusing of the cache entry to store new data. The heuristic used to select the cache entry to evict is known as the replacement policy. One popular replacement policy, “least recently used” (LRU), replaces the least recently used cache entry. Conventionally, to implement the LRU policy a single linked list of elements is maintained, wherein each linked list element is mapped (i.e., logically linked) to a cache entry.
When a cache entry is accessed, its corresponding linked list element is moved to the head of the linked list. Thus, an ordered linked list is maintained based on access time, and the tail of the linked list contains the LRU entry that is chosen when eviction is needed. Such a conventional mechanism for evicting cache entries works well if the number of threads accessing the cache entries and updating the linked list is relatively low. For highly multi-threaded environments, however, the head of the list quickly becomes a bottleneck because many threads are simultaneously trying to lock the head of the list in order to insert their recently accessed element. Locking the linked list prevents other threads from updating the linked list. Thus, the system does not perform as well as expected because threads sit idle waiting to access the head of the linked list.
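The conventional move-to-head bookkeeping described above can be sketched, for a single thread and without any locking, as follows. The names `Element`, `LRUList`, `touch`, and `evict` are hypothetical and are used here only for illustration; a doubly linked list is assumed so that an element can be unlinked without traversing the list.

```python
from typing import List, Optional


class Element:
    """Linked list element mapped to one cache entry (hypothetical names)."""

    def __init__(self, content: str) -> None:
        self.content = content
        self.prev: Optional["Element"] = None
        self.next: Optional["Element"] = None


class LRUList:
    """Doubly linked list ordered from MRU (head) to LRU (tail)."""

    def __init__(self) -> None:
        self.head: Optional[Element] = None
        self.tail: Optional[Element] = None

    def push_head(self, elem: Element) -> None:
        # Insert an element that is not currently in the list at the head.
        elem.prev = None
        elem.next = self.head
        if self.head is not None:
            self.head.prev = elem
        else:
            self.tail = elem
        self.head = elem

    def _unlink(self, elem: Element) -> None:
        # Detach an element that is currently in the list.
        if elem.prev is not None:
            elem.prev.next = elem.next
        else:
            self.head = elem.next
        if elem.next is not None:
            elem.next.prev = elem.prev
        else:
            self.tail = elem.prev
        elem.prev = elem.next = None

    def touch(self, elem: Element) -> None:
        # On access, move the element to the head (most recently used).
        if elem is not self.head:
            self._unlink(elem)
            self.push_head(elem)

    def evict(self, new_content: str) -> str:
        # Reuse the tail (LRU) element for new content and move it to the head.
        victim = self.tail
        if victim is None:
            raise IndexError("list is empty")
        old = victim.content
        victim.content = new_content
        self.touch(victim)
        return old

    def order(self) -> List[str]:
        # Contents from head (MRU) to tail (LRU).
        out: List[str] = []
        node = self.head
        while node is not None:
            out.append(node.content)
            node = node.next
        return out


lru = LRUList()
elems = {c: Element(c) for c in "FEDCBA"}
for c in "FEDCBA":               # push F first so that A ends up at the head
    lru.push_head(elems[c])
before = lru.order()
lru.touch(elems["C"])            # access C: it moves to the head
after_touch = lru.order()
evicted = lru.evict("G")         # the tail entry is reused for G
after_evict = lru.order()
```

Every call to `touch` or `evict` in this sketch writes to `head`, which is why, once several threads share the list, the head becomes the point of contention.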
FIGS. 1A-1C are block diagrams illustrating linked list 110 maintained by a conventional system for implementing the LRU policy. Linked list 110 includes linked list elements 111-116, wherein each linked list element corresponds to a cache entry (not shown). Linked list element 111 is the head element and corresponds to the most recently used (MRU) cache entry. Linked list element 116 is the tail element and corresponds to the LRU cache entry. Each linked list element contains pointers (not shown) linking it to other elements in the linked list. For example, linked list element 112 contains a pointer pointing to previous element 111 and a pointer pointing to next element 113. A singly linked list, however, contains within each of its elements only a pointer pointing to the next element. For example, if linked list 110 were singly linked, linked list element 112 would contain only a pointer pointing to element 113. Further, linked list 110 includes head pointer data structure 150 that contains a pointer pointing to its head element 111. In FIGS. 1A-1C, each linked list element is shown with a letter followed by a colon and a number (e.g., “A:10”). Here, the letter represents the content currently stored at the corresponding cache entry, and the number represents the timestamp of when the corresponding cache entry was last accessed. Thus, in the example “A:10”, the linked list element corresponds to a cache entry that contains content “A”, which was last accessed at time “10”. Further, a bolded box indicates that the linked list element is locked. As illustrated in FIG. 1A, linked list elements 111-116 contain the content:timestamp pairs A:10, B:9, C:7, D:5, E:3, and F:1, respectively.
Referring now to FIG. 1B, a first thread accesses content C from the cache entry corresponding to element 113 at time 11. To record the access, the first thread locks head pointer 150 and moves linked list element 113 to the head of the linked list. The first thread updates linked list element 113 with the timestamp of when the cache entry was accessed (i.e., 11). After linked list element 113 has been updated, the first thread unlocks head pointer 150. Note that during this process, other threads may be contending for access to linked list 110. In such a scenario, the other threads are stalled until the first thread has completed its processing of linked list 110. In a system having multiple threads, such a limitation can have a severe impact on system performance.
Referring now to FIG. 1C, a second thread evicts the cache entry corresponding to element 116 at time 12. In this example, the second thread evicts content F and populates content G in the cache entry. To do so, the second thread locks head pointer 150 and moves element 116 to the head of the linked list. The second thread updates linked list element 116 with the timestamp of when the cache entry was populated (i.e., 12). After linked list element 116 has been updated, the second thread unlocks head pointer 150. Note that the requests to update linked list elements 113 and 116 (and possibly numerous other requests) may occur simultaneously. In such a scenario, each contending thread is stalled until the thread currently holding the lock has completed its processing of linked list 110.
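The sequence of FIGS. 1A-1C can be sketched as follows, assuming a single lock that stands in for the lock on head pointer 150. The class `LockedLRU` and its methods are hypothetical, and an `OrderedDict` is used in place of explicit list pointers for brevity. The orderings produced for FIGS. 1B and 1C are those implied by the description (the updated element moves to the head); they are not read from the figures themselves.

```python
import threading
from collections import OrderedDict
from typing import List, Tuple


class LockedLRU:
    """Hypothetical conventional LRU list guarded by one lock."""

    def __init__(self, entries: List[Tuple[str, int]]) -> None:
        self._lock = threading.Lock()        # stands in for the lock on head pointer 150
        # First item is the head (MRU); last item is the tail (LRU).
        self._list: "OrderedDict[str, int]" = OrderedDict(entries)

    def access(self, content: str, now: int) -> None:
        with self._lock:                     # all other threads stall here
            if content not in self._list:
                raise KeyError(content)
            self._list[content] = now        # update the timestamp in place
            self._list.move_to_end(content, last=False)   # move to the head

    def evict_and_fill(self, new_content: str, now: int) -> str:
        with self._lock:
            victim, _ = self._list.popitem(last=True)     # take the tail (LRU)
            self._list[new_content] = now                 # reuse the entry for new data
            self._list.move_to_end(new_content, last=False)
            return victim

    def snapshot(self) -> List[str]:
        with self._lock:
            return [f"{c}:{t}" for c, t in self._list.items()]


lru = LockedLRU([("A", 10), ("B", 9), ("C", 7), ("D", 5), ("E", 3), ("F", 1)])
fig_1a = lru.snapshot()

# First thread: accesses content C at time 11 (FIG. 1B).
t1 = threading.Thread(target=lru.access, args=("C", 11))
t1.start()
t1.join()
fig_1b = lru.snapshot()

# Second thread: evicts F and populates G at time 12 (FIG. 1C).
t2 = threading.Thread(target=lru.evict_and_fill, args=("G", 12))
t2.start()
t2.join()
fig_1c = lru.snapshot()
```

The two threads are run one after the other here so that the result is deterministic. If they were started together, one of them would block on `self._lock` until the other had released it.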
FIG. 2 is a timeline diagram illustrating multiple threads contending for access to a linked list in a conventional implementation of the LRU policy. FIG. 2 shall be described with reference to FIGS. 1B-1C. Referring now to FIG. 2, during time period 210, a first thread has locked a linked list in order to update the head of the list. For example, in FIG. 1B, the first thread locks linked list 110 in order to update and move element 113 to the head of the list. During time period 211, a second thread is contending for access to the linked list. For example, the second thread of FIG. 1C contends for access to linked list 110 in order to update element 116. The contention may occur, for example, while the first thread is updating the linked list as shown in FIG. 1B. During time period 212, the second thread has gained access to the linked list, and updates the element. For example, the second thread of FIG. 1C updates element 116 by moving it to the head of linked list 110. Note that during time period 211, the second thread is stalled, waiting for access to the linked list. Embodiments of the present invention overcome these limitations by providing mechanisms for concurrent updating of elements corresponding to cache entries.
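The stall of the second thread during time period 211 in the conventional approach can be reproduced with the following sketch. The names `list_lock`, `events`, and the two thread functions are hypothetical. Two `threading.Event` objects are used only to force the interleaving of FIG. 2 deterministically; a real system would have no such coordination.

```python
import threading
from typing import List

list_lock = threading.Lock()            # stands in for the lock on the linked list
events: List[str] = []                  # ordered record of what happened
stalled: List[bool] = []                # whether the second thread found the lock taken
first_has_lock = threading.Event()
second_is_waiting = threading.Event()


def first_thread() -> None:
    with list_lock:
        events.append("210: first thread holds the lock")
        first_has_lock.set()
        # Keep the lock until the second thread has tried to get it.
        second_is_waiting.wait()
        events.append("210: first thread finishes its update")


def second_thread() -> None:
    first_has_lock.wait()
    # A non-blocking attempt fails because the first thread holds the lock.
    got_it = list_lock.acquire(blocking=False)
    stalled.append(not got_it)
    if got_it:                          # not expected in this interleaving
        list_lock.release()
    events.append("211: second thread is contending")
    second_is_waiting.set()
    with list_lock:                     # blocks until the first thread releases
        events.append("212: second thread holds the lock")


threads = [threading.Thread(target=first_thread),
           threading.Thread(target=second_thread)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

In this sketch the second thread does no useful work between its failed attempt and the moment the first thread releases the lock, which corresponds to the idle period 211.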