1. Technical Field
This disclosure relates to data cache servers, data cache clients, data stores, and inconsistencies between cached data and the data in a data store on which the cached data was based.
2. Description of Related Art
The workload of certain application classes, such as social networking, may be dominated by queries that read data. See F. Benevenuto, T. Rodrigues, M. Cha, and V. Almeida, “Characterizing user behavior in online social networks,” in Internet Measurement Conference, 2009. An example is a user profile page. A user may update her profile page rarely, such as only once every few hours, days, or even weeks. During these same periods, these profile pages may be referenced and displayed frequently, such as every time the user logs in and navigates between pages.
To enhance system performance, these applications may augment a data store, such as a standard SQL-based relational database management system (RDBMS), e.g., MySQL, with a data cache server. The data cache server may use a Key-Value Store (KVS), materializing key-value pairs computed using normalized relational data. A key-value pair might be finely tuned to the requirements of an application, e.g., dynamically generated HTML formatted pages. See J. Challenger, P. Dantzig, and A. Iyengar, “A Scalable System for Consistently Caching Dynamic Web Data,” in proceedings of the 18th Annual Joint Conference of the IEEE Computer and Communications Societies, 1999; A. Iyengar and J. Challenger, “Improving Web Server Performance by Caching Dynamic Data,” in proceedings of the USENIX Symposium on Internet Technologies and Systems, pages 49-60, 1997; C. Amza, G. Soundararajan, and E. Cecchet, “Transparent Caching with Strong Consistency in Dynamic Content Web Sites,” in Supercomputing, ICS '05, pages 264-273, New York, N.Y., USA, 2005, ACM; V. Holmedahl, B. Smith, and T. Yang, “Cooperative Caching of Dynamic Content on a Distributed Web Server,” in HPDC, pages 243-250, 1998; K. S. Candan, W. Li, Q. Luo, W. Hsiung, and D. Agrawal, “Enabling dynamic content caching for database-driven web sites,” in SIGMOD Conference, pages 532-543, 2001; A. Datta, K. Dutta, H. M. Thomas, D. E. VanderMeer, and K. Ramamritham, “Proxy-based Acceleration of Dynamically Generated Content on the World Wide Web: An Approach and Implementation,” ACM Transactions on Database Systems, pages 403-443, 2004. The KVS may manage a large number (billions) of such highly optimized representations.
A cache augmented SQL RDBMS (CASQL) may enhance performance dramatically because a KVS lookup may be significantly faster than processing an SQL query. This explains the popularity of memcached, an in-memory distributed KVS deployed by sites such as YouTube, see C. D. Cuong, “YouTube Scalability”, Google Seattle Conference on Scalability, June 2007, and Facebook, see P. Saab, “Scaling memcached at Facebook”, December 2008; R. Nishtala et al., “Scaling Memcache at Facebook,” in 10th USENIX Symposium on Networked Systems Design and Implementation, 385-398 (2013).
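The read path of such a cache augmented arrangement may be sketched as follows. This is a minimal illustration only, not an implementation taken from any of the cited systems: a Python dictionary stands in for the KVS, an in-memory SQLite database stands in for the RDBMS, and the table layout, the key format, and the name get_profile_page are all hypothetical.

```python
import sqlite3

# Assumed stand-ins: a dict plays the KVS, in-memory SQLite plays the RDBMS.
kvs = {}
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, bio TEXT)")
db.execute("INSERT INTO users VALUES (1, 'Alice', 'Hello')")
db.commit()

def get_profile_page(user_id):
    """Return the HTML profile page, consulting the KVS before the RDBMS."""
    key = f"profile:{user_id}"          # hypothetical key format
    page = kvs.get(key)
    if page is not None:
        return page                     # KVS hit: no SQL is processed
    # KVS miss: compute the value from the normalized relational data.
    row = db.execute(
        "SELECT name, bio FROM users WHERE id = ?", (user_id,)
    ).fetchone()
    if row is None:
        return None                     # no such user; nothing is cached
    page = f"<h1>{row[0]}</h1><p>{row[1]}</p>"
    kvs[key] = page                     # materialize the key-value pair
    return page
```

After the first call for a given user, later calls may be served from the KVS alone, which is the source of the performance gain described above.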
With CASQLs, a consistency technique may maintain the relationship between the normalized data and its key-value representation, may detect changes to the normalized data, and may invalidate the corresponding key-value(s) stored in the KVS. Other possibilities include refreshing, see J. Challenger, P. Dantzig, and A. Iyengar, “A Scalable System for Consistently Caching Dynamic Web Data,” in proceedings of the 18th Annual Joint Conference of the IEEE Computer and Communications Societies, 1999; S. Ghandeharizadeh and J. Yap, “Cache Augmented Database Management Systems,” in Third ACM SIGMOD Workshop on Databases and Social Networks, 2013, or incrementally updating, see P. Gupta, N. Zeldovich, and S. Madden, “A Trigger-Based Middleware Cache for ORMs,” in Middleware, 2011, the corresponding key-value. Almost all techniques may suffer from race conditions, as explained in more detail below. The significance of these race conditions has been highlighted in D. R. K. Ports, A. T. Clements, I. Zhang, S. Madden, and B. Liskov, “Transactional consistency and automatic management in an application data cache,” in OSDI. USENIX, October 2010. This article describes how a web site may decide to not materialize failed key-value lookups because the KVS may become inconsistent with the database permanently.
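An invalidation-based consistency technique of the kind described above may be sketched as follows. The sketch assumes a Python dictionary as the KVS and an in-memory SQLite database as the RDBMS; the table layout, the key format, and the name update_bio are hypothetical and are not drawn from the cited references.

```python
import sqlite3

# Assumed stand-ins: a dict plays the KVS, in-memory SQLite plays the RDBMS.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, bio TEXT)")
db.execute("INSERT INTO users VALUES (1, 'Alice', 'Hello')")
db.commit()

# A key-value pair previously materialized from the row above.
kvs = {"profile:1": "<h1>Alice</h1><p>Hello</p>"}

def update_bio(user_id, new_bio):
    """Change the normalized data, then invalidate the dependent key-value pair."""
    db.execute("UPDATE users SET bio = ? WHERE id = ?", (new_bio, user_id))
    db.commit()
    # Invalidate: delete the cached value so the next read recomputes it.
    kvs.pop(f"profile:{user_id}", None)

update_bio(1, "Updated")
```

Refreshing or incrementally updating would replace the final deletion with a write of a new value. In each case the change to the RDBMS and the change to the KVS are separate steps, and it is between such steps that the race conditions noted above may arise.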
As an example, consider Alice, who is trying to retrieve her profile page while the web site's administrator is trying to delete it because she violated the site's terms of use. The discussion below shows how an interleaved execution of these two logical operations may leave the KVS inconsistent with the database, such that the KVS reflects the existence of Alice's profile page while the database holds no records pertaining to Alice. A subsequent lookup of the key-value pair corresponding to Alice's profile page may thus undesirably succeed, incorrectly reflecting Alice's existence in the system.
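One interleaving that may produce this outcome is sketched below as a single-threaded sequence of steps. The particular ordering is an assumption chosen for illustration, as are the dictionary standing in for the KVS, the in-memory SQLite database standing in for the RDBMS, and the key format.

```python
import sqlite3

# Assumed stand-ins: a dict plays the KVS, in-memory SQLite plays the RDBMS.
kvs = {}
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
db.execute("INSERT INTO users VALUES (1, 'Alice')")
db.commit()
key = "profile:1"                       # hypothetical key format

# Step 1 (Alice's read): the KVS lookup misses.
missed = kvs.get(key) is None

# Step 2 (Alice's read): query the RDBMS and compute the page.
row = db.execute("SELECT name FROM users WHERE id = 1").fetchone()
page = f"<h1>{row[0]}</h1>"

# Step 3 (administrator's delete): remove Alice from the RDBMS.
db.execute("DELETE FROM users WHERE id = 1")
db.commit()

# Step 4 (administrator's delete): invalidate the key. Nothing is cached
# yet, so this deletion has no effect.
kvs.pop(key, None)

# Step 5 (Alice's read): store the page computed in step 2, which is now stale.
kvs[key] = page

# Final state: the KVS holds a page for a user the RDBMS no longer has.
rows_in_db = db.execute("SELECT COUNT(*) FROM users WHERE id = 1").fetchone()[0]
cached = key in kvs
```

In this ordering, rows_in_db ends at 0 while cached ends as True. Because no later write to the database touches Alice's records, nothing would trigger another invalidation of the stale entry.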