Distributed platforms, such as content delivery networks (CDNs), operate a distributed set of servers for delivering content and services to requesting users spread across the Internet. A primary objective of the distributed platform is to optimize and improve the content and service delivery performance of its servers so that the content and services are served to requesting users in less time.
Caching is one method by which the distributed platform improves content delivery performance. The distributed platform deploys different sets of caching servers to different geographic regions. Each set of caching servers deployed to a particular region is referred to as a point-of-presence (PoP). The location of each PoP is specifically selected to be geographically proximate to a large population of content requesting and consuming users. The caching servers cache content provider content by retrieving the content provider content from content provider origin servers and temporarily storing (i.e., caching) copies of the content in memory. The distributed platform routes user requests to the caching servers that are closest to the requesting users. The caching servers are then able to respond to the requests by serving the cached copies of the content from memory without having to retrieve the content again from the more distant origin servers.
To maximize the cache footprint and cache utilization of each PoP, the distributed platform places directors in each of the PoPs. The one or more directors of a particular PoP control the distribution of user requests across the caching servers of that particular PoP. In some cases, the directors maximize the cache footprint and cache utilization of the PoP caching servers by performing a persistent request distribution. In particular, a director operating in a PoP with a particular set of caching servers routes requests for the same content to the same caching server of the particular set of caching servers. In doing so, each caching server of the set of caching servers is tasked with caching and delivering a unique subset of the overall content cached within the PoP. This reduces the number of distributed platform caching servers that retrieve content from a content provider's origin server, maximizes cache-hit ratios, and reduces redundant caching of the same content in different caching servers of the same PoP, thereby allowing the PoP to cache a greater total amount of unique content than if multiple caching servers of the same PoP were to cache separate copies of the same content. Directors typically perform the persistent request distribution by hashing a request Uniform Resource Locator (URL) and using the hash result to select one of the caching servers of the PoP.
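The hash-based selection described above can be illustrated with a minimal sketch. The text states only that the request URL is hashed and the hash result is used to select a caching server; the choice of SHA-256, the modulo mapping, the function name `select_server`, and the server names are assumptions made for illustration and are not taken from any particular director implementation.

```python
import hashlib


def select_server(url: str, servers: list[str]) -> str:
    """Map a request URL to one caching server of the PoP.

    Assumption: SHA-256 and a simple modulo are used here only for
    illustration; a director may use any hash and any mapping.
    """
    digest = hashlib.sha256(url.encode("utf-8")).digest()
    # Interpret the first 8 bytes of the digest as an unsigned integer.
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]


# Hypothetical caching servers of a single PoP.
POP_SERVERS = ["cache-1", "cache-2", "cache-3", "cache-4"]

first = select_server("http://example.com/images/logo.png", POP_SERVERS)
second = select_server("http://example.com/images/logo.png", POP_SERVERS)
# Requests for the same URL always resolve to the same caching server.
print(first == second)
```

Because the mapping depends only on the URL and the server list, every director in the PoP that uses the same list arrives at the same server for a given URL, which is what makes the distribution persistent.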
In real-world scenarios, persistent request distribution can suffer inefficiencies that degrade caching server performance and the overall content delivery performance of the distributed platform. Inefficiencies arise when the content provider content delivered by the distributed platform caching servers involves a mix of cacheable and uncacheable content.
Cacheable content is any static content that remains unchanged for some period of time and is not customized on an individual user basis. Consequently, the same copy of cacheable content can be served to different users. Examples of cacheable content include images and media streams.
Uncacheable content is dynamic content or content that is in some way customized on a per-request or per-user basis. Examples of uncacheable content include secure websites that are delivered after a user login and ecommerce sites that are customized based on prior search or purchase activity of the user. As the name implies, uncacheable content is content that, for the most part, cannot be cached by the distributed platform caching servers. Each uncacheable content request received by a caching server triggers a retrieval back to the content provider's origin server in order to obtain the content.
Applying persistent request distribution to a mix of cacheable and uncacheable content creates inefficiencies in distributed platform performance because caching servers that receive the uncacheable content requests spend more time and resources in responding to those requests than caching servers that receive and respond to cacheable content requests. Caching servers receiving uncacheable content requests retrieve the requested content from a content provider origin server or dynamically generate the content, whereas caching servers receiving cacheable content requests simply serve copies of the content from cache with no access back to the content provider origin server and with little to no processing of the content. For these reasons, persistent request distribution involving requests for cacheable and uncacheable content can lead to disproportionate loads on the caching servers.
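The load imbalance described above can be illustrated with a small sketch. All values are hypothetical: the relative costs `HIT_COST` and `ORIGIN_COST`, the request paths, and the four-server PoP are assumptions chosen only to show the effect, not measurements of any real platform.

```python
import hashlib

HIT_COST = 1      # assumed relative cost of serving a copy from cache
ORIGIN_COST = 10  # assumed relative cost of an origin retrieval or dynamic generation


def select_index(url: str, num_servers: int) -> int:
    """Persistent selection: the same URL always maps to the same server index."""
    digest = hashlib.sha256(url.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_servers


def server_loads(requests: list[tuple[str, bool]], num_servers: int) -> list[int]:
    """Sum the assumed cost of each request on the server that receives it."""
    loads = [0] * num_servers
    for url, cacheable in requests:
        cost = HIT_COST if cacheable else ORIGIN_COST
        loads[select_index(url, num_servers)] += cost
    return loads


# Eight cacheable image requests plus eight requests for one uncacheable page.
requests = [(f"/img/{i}.png", True) for i in range(8)]
requests += [("/account/home", False)] * 8

loads = server_loads(requests, 4)
print(loads)
```

Since every request for the uncacheable page hashes to the same server, that one server carries all of the origin-retrieval cost while the remaining servers handle only inexpensive cache hits.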
There is therefore a need to better optimize request distribution across distributed platform caching servers in order to account for the different loads imposed on the caching servers by cacheable content requests and uncacheable content requests. To this end, there is a need to differentiate the distribution of cacheable content requests from the distribution of uncacheable content requests.