Today, data centers (e.g., Internet data centers) are often used to store content associated with websites. Data centers are geographically distributed so as to provide regional storage for the content being made available from a website. By geographically distributing website content across the various data centers, the server load for a particular website is also geographically distributed, thereby reducing response time and avoiding network congestion. For example, when a user requests data (such as a webpage) from a website via a network (such as the Internet), the geographically proximate data center can be accessed to retrieve the requested data. As a result, the use of data centers allows the requested data to be returned to the requester faster and more reliably.
At such data centers, numerous applications are typically running on numerous computing devices (namely, servers) resident at the data centers. Since a data center normally hosts data for a large number of websites, each data center concurrently operates a set of like applications to handle the high volume of incoming requests to the various websites. Hence, within a given data center, the load on a particular type of application can be distributed across the set of like applications that are concurrently operating. The use of the concurrent applications also provides redundancy in case of failures. Nevertheless, it is not uncommon for an application, or a computing device operating one or more applications, to fail, shut down, or lock up. In such a case, the application (and possibly also the computing device) needs to be restarted (or re-launched) in order to resume operation. For performance reasons, it is common for these applications at the data center to each utilize a cache to store data that is likely to be requested. Advantageously, a cache can significantly improve an application's response time.
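The load distribution across a set of like applications can be sketched as a simple round-robin dispatcher. This is an illustrative sketch only, not the mechanism described in the text; the instance names (`app-1`, etc.) are hypothetical.

```python
from itertools import cycle

class RoundRobinDispatcher:
    """Distributes incoming requests evenly across a set of like application instances."""

    def __init__(self, instances):
        self._instances = list(instances)
        self._rotation = cycle(self._instances)

    def dispatch(self, request):
        # Pick the next instance in rotation. In a real deployment, a failed
        # instance would be skipped and restarted, as the text describes.
        instance = next(self._rotation)
        return instance, request

dispatcher = RoundRobinDispatcher(["app-1", "app-2", "app-3"])
assignments = [dispatcher.dispatch(f"req-{i}")[0] for i in range(6)]
# Each of the three instances receives an equal share of the six requests.
```

Because every instance runs the same application, any of them can serve any request, which is also what provides the redundancy mentioned above.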
Unfortunately, when an application is restarted, its cache is initially empty. While the cache is empty, the application's response time in responding to requests is dramatically longer (i.e., slower) than when the cache is fully populated. Hence, when an incoming request is served by a newly restarted application, the requester must endure significant undesired delay (i.e., latency) before a response containing the requested data is supplied. Consequently, there is a need for improved approaches to better manage response latency for restarted applications.
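The cold-cache penalty described above can be illustrated with a minimal sketch. The cache class, key, and simulated backend delay are all hypothetical stand-ins; the point is only that the first request after a restart pays the full backend latency while subsequent requests do not.

```python
import time

class SimpleCache:
    """A minimal read-through cache: on a miss, compute and remember the value."""

    def __init__(self):
        self._store = {}

    def get(self, key, compute):
        # Cache miss: fall back to the (slow) backing computation.
        if key not in self._store:
            self._store[key] = compute(key)
        return self._store[key]

def slow_fetch(key):
    # Stand-in for a slow backend lookup (e.g., disk or remote database).
    time.sleep(0.05)
    return f"page-for-{key}"

cache = SimpleCache()  # freshly "restarted": the cache starts empty

t0 = time.perf_counter()
cache.get("/index.html", slow_fetch)   # cold: pays the backend latency
cold = time.perf_counter() - t0

t0 = time.perf_counter()
cache.get("/index.html", slow_fetch)   # warm: served from the cache
warm = time.perf_counter() - t0
# The warm lookup avoids the simulated backend delay entirely.
```

A restarted application faces exactly the cold-path timing for every distinct key until its cache is repopulated, which is the latency problem this section motivates.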