Distributed platforms, such as content delivery networks (CDNs), operate a distributed set of servers through which they deliver content and services to different requesting users over digital networks, such as the Internet.
The distributed platform deploys different sets of caching servers to different points-of-presence (PoPs). The location of each PoP can be selected to be geographically proximate to a different large population of content requesting and consuming users. To optimize the delivery of the content and services, the distributed platform routes user requests to the caching servers or the PoP closest to the requesting users. The caching servers are then able to respond to the requests by serving cached copies of the content from memory without having to retrieve the content again from the more distant origin servers.
The request distribution across the caching servers of a PoP is controlled by one or more load balancers or request directors operating in the PoP. The request directors perform an intelligent request distribution across the caching servers in order to further optimize caching server performance.
The intelligent request distribution involves the request directors routing requests for the same content to the same caching server or subset of caching servers of the PoP. In doing so, each caching server caches and delivers a unique subset of the overall content cached within the PoP. The intelligent request routing reduces the number of caching servers that retrieve content from a content provider's origin server, maximizes cache-hit ratios, and reduces redundant caching of the same content in different caching servers of the same PoP.
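The same-content-to-same-server behavior described above can be illustrated with a minimal sketch. It assumes a simple hash-modulo mapping; the text does not specify the mapping the request directors use, and the server names and the `pick_server` function are hypothetical.

```python
import hashlib


def pick_server(url: str, servers: list[str]) -> str:
    """Map a URL to one caching server so repeat requests reach the same cache.

    Hash-modulo is an assumed scheme used only for illustration.
    """
    digest = hashlib.sha256(url.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an integer index into the list.
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]


# Hypothetical caching servers of a single PoP.
servers = ["cache-1", "cache-2", "cache-3", "cache-4"]

first = pick_server("http://example.com/a.jpg", servers)
second = pick_server("http://example.com/a.jpg", servers)
```

Because the mapping depends only on the URL and the server list, `first` and `second` name the same server, so only that server needs to retrieve the content from the origin.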
The intelligent request distribution also allows for specialized caching server operation. Different subsets of servers within a PoP can be configured or optimized to cache and deliver particular subsets or types of content more efficiently than other content. For instance, requests for large-sized content can be distributed across a first subset of caching servers in a PoP and requests for small-sized content can be distributed across a second subset of caching servers in the PoP. The first subset of caching servers can be configured with different memory or cache resources and different caching and delivery operations than the second subset of caching servers so that the server performance is optimized for the particular size of content handled by that server.
In addition to or instead of content size, requests can be differentiated on the basis of cacheable or uncacheable content types, prioritized and unprioritized content types, dynamic and static content types, streaming and non-streaming content types, and supplemental (e.g., advertisement) or primary content types. These types are exemplary; the request directors can differentiate requests and types based on any criteria.
For each differentiated type, a different subset of caching servers can be optimized or configured to respond to requests for that type more efficiently than other types. There is a performance penalty if a request is differentiated to an incorrect type and routed to a server that is optimized for a content type that is different than the content type for the content of the request. Accordingly, optimal content delivery performance is realized from distributing requests for different content types to the server or subset of servers that are optimized for delivering content of the requested types.
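The type-differentiated distribution described above can be sketched as follows, using content size as the differentiating criterion. The threshold value, the subset names, and the functions `content_type_for` and `route_by_type` are assumptions made for illustration and do not come from the text.

```python
import hashlib

# Hypothetical cutoff between "small" and "large" content.
LARGE_THRESHOLD_BYTES = 1_000_000

# Hypothetical subsets of caching servers, each optimized for one type.
SUBSETS = {
    "large": ["large-1", "large-2"],
    "small": ["small-1", "small-2", "small-3"],
}


def content_type_for(size_bytes: int) -> str:
    """Differentiate content by size (one of many possible criteria)."""
    return "large" if size_bytes >= LARGE_THRESHOLD_BYTES else "small"


def route_by_type(url: str, size_bytes: int) -> str:
    """Pick a server from the subset optimized for the content's type."""
    subset = SUBSETS[content_type_for(size_bytes)]
    digest = hashlib.sha256(url.encode("utf-8")).digest()
    # Within the subset, keep the same URL on the same server.
    return subset[int.from_bytes(digest[:8], "big") % len(subset)]


video_server = route_by_type("http://example.com/movie.mp4", 50_000_000)
icon_server = route_by_type("http://example.com/icon.png", 2_000)
```

Note that this sketch takes the content size as an input, which is exactly the knowledge a request director lacks the first time it sees a request.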
The request directors rely on deterministic methods to track requests or content directed to the different types. The deterministic methods, however, require prior knowledge of a request, or of the content specified in the request, in order to properly route the request to a server or across a subset of servers optimized for the type of content specified in the request. The deterministic methods are primed when one or more request directors of a PoP receive a first request for particular content, arbitrarily route the first request to a server in the PoP, detect the type of content that is sent from the server in response to the first request, and associate the content type with the first request or the served content. Thereafter, the next time a request for that same content is received, the request directors can properly route the request to a server or across a specific subset of servers in the PoP that are optimized for that content type. For instance, the request directors can maintain two different hash tables when the request differentiation is based on two different types. When a request is received at a particular request director, the particular request director can hash the request Uniform Resource Locator (URL) and query the different hash tables to determine if there is an existing hash for the URL in any table. If there is no hash in either table, then the request is the first such request received at the PoP for the content requested by the URL, and the particular request director must guess the request type and arbitrarily distribute the request without knowing which server is optimized for the type of content associated with that request.
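The two-table lookup and its first-request blind spot can be sketched as follows. This is a minimal illustration under stated assumptions: the class `RequestDirector`, its method names, the "large" and "small" type labels, and the use of SHA-256 are all hypothetical choices, not details given in the text.

```python
import hashlib
import random


class RequestDirector:
    """Tracks URL hashes in one table per content type (two types assumed)."""

    def __init__(self, large_servers, small_servers, seed=None):
        self.subsets = {"large": list(large_servers), "small": list(small_servers)}
        # One hash table (here a set of URL hashes) per differentiated type.
        self.tables = {"large": set(), "small": set()}
        self._rng = random.Random(seed)

    @staticmethod
    def _url_hash(url: str) -> str:
        return hashlib.sha256(url.encode("utf-8")).hexdigest()

    def lookup(self, url: str):
        """Return the known type for the URL, or None if no table has its hash."""
        url_hash = self._url_hash(url)
        for content_type, table in self.tables.items():
            if url_hash in table:
                return content_type
        return None

    def route(self, url: str):
        """Return (server, known_type); known_type is None for a first request."""
        known_type = self.lookup(url)
        if known_type is None:
            # First request for this URL: no prior knowledge, so the choice
            # is arbitrary and may land on a server optimized for another type.
            candidates = self.subsets["large"] + self.subsets["small"]
            return self._rng.choice(candidates), None
        subset = self.subsets[known_type]
        index = int(self._url_hash(url), 16) % len(subset)
        return subset[index], known_type

    def record(self, url: str, observed_type: str) -> None:
        """Prime the tables once the type is detected from the server response."""
        self.tables[observed_type].add(self._url_hash(url))


director = RequestDirector(["L1", "L2"], ["S1", "S2"], seed=0)
first_server, first_type = director.route("/video.mp4")    # arbitrary routing
director.record("/video.mp4", "large")                     # type detected afterwards
second_server, second_type = director.route("/video.mp4")  # routed by known type
```

Only the second call is routed with knowledge of the content type; the first can reach any server in the PoP, which is the gap the following paragraph describes.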
The inability to correctly route the first instances of different requests can create cache pollution and degrade content delivery performance of individual servers and the distributed platform as a whole. There is therefore a need for non-deterministic methods to route requests in cases where deterministic methods cannot be used or are not primed with the prior information necessary to correctly route the requests. There is a need for such non-deterministic methods to execute with minimal overhead and delay while providing a high level of accuracy.