A content delivery network (CDN) is a distributed platform that delivers customer (i.e., content provider) content to different end users from different distribution points within a digital network. The CDN operates different points-of-presence (PoPs) throughout the digital network, and these PoPs form the content distribution points. One or more CDN caching servers operate in each of the PoPs. The CDN caching servers cache and serve the customer content on behalf of the CDN customers. The CDN effectively fans out the customer content from the customer origin to a larger set of CDN caching servers. The larger set of CDN caching servers then redistributes the customer content to an even larger set of end users while optimizing the delivery by localizing the distribution points from which end users receive the content. The customer origin has minimal distribution capacity relative to the collective distribution capacity of the CDN as provided by the numerous CDN caching servers. The customer origin is typically formed from one or more origin servers that are under the control of the customer or content provider, with the origin servers being the point of origin for the customer content.
The CDN distribution model shields the customer origin from the high volume of end user requests for the customer content. Nonetheless, the CDN imposes its own load on the customer origin servers. In particular, the CDN caching servers make at least one access to the customer origin servers in order to obtain and cache copies of the original customer content that are then redistributed in response to the end user requests.
To reduce the load that the CDN caching servers impose on the customer origin, the CDN typically designates one or more caching servers from a larger number of caching servers within each PoP to cache and distribute content of different customers. Persistent request distribution ensures that end user requests for particular customer content received at a CDN PoP are directed to the same one or more caching servers within the PoP that are designated to distribute that particular customer content.
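As a minimal sketch of persistent request distribution, the following assumes a hash-based mapping from a content key to the caching servers of a PoP. The text above does not specify how the designation is made, so the hashing scheme, the function name `designated_servers`, and the server and content names are all hypothetical.

```python
import hashlib


def designated_servers(content_key, servers, count=1):
    """Return the `count` servers in a PoP designated for `content_key`.

    Assumption: designation is made by ranking servers on a hash of the
    (server, content key) pair. Any deterministic mapping would serve the
    same purpose of sending repeat requests to the same servers.
    """
    def score(server):
        digest = hashlib.sha256(f"{server}|{content_key}".encode("utf-8")).hexdigest()
        return int(digest, 16)

    # Highest-scoring servers win; the ranking is stable across requests.
    return sorted(servers, key=score, reverse=True)[:count]


# Hypothetical PoP with four caching servers.
pop_servers = ["cache-1", "cache-2", "cache-3", "cache-4"]
first = designated_servers("customer-a/video.mp4", pop_servers)
second = designated_servers("customer-a/video.mp4", pop_servers)
```

Because the mapping depends only on the content key and the server list, every request for the same content within the PoP resolves to the same designated servers, so only those servers need to fetch a copy from the customer origin.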
However, there are cases when demand for particular content spikes and the request rate (i.e., number of requests for particular content over an interval) or the byte rate (i.e., bytes per second) for the particular content surpasses one or more “hot” thresholds. In such cases, the CDN dynamically allocates additional caching servers within the PoPs to handle the increased load and to dedicate more resources to caching and distributing the hot content.
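The hot-content test described above can be sketched as a comparison of measured rates against thresholds. This is an illustration only: the function name `is_hot`, its parameters, and the threshold values are assumptions, and the text does not state how a CDN measures or configures these rates.

```python
def is_hot(request_count, bytes_served, interval_seconds,
           max_request_rate, max_byte_rate):
    """Return True when either rate surpasses its (hypothetical) hot threshold."""
    request_rate = request_count / interval_seconds  # requests per second
    byte_rate = bytes_served / interval_seconds      # bytes per second
    return request_rate > max_request_rate or byte_rate > max_byte_rate


# Hypothetical thresholds: 50 requests/s or 1 GB/s over a 60-second interval.
spiking = is_hot(6000, 1_000_000, 60, max_request_rate=50.0, max_byte_rate=1e9)
quiet = is_hot(600, 1_000_000, 60, max_request_rate=50.0, max_byte_rate=1e9)
```

In the first call the request rate is 100 requests per second, which exceeds the assumed threshold of 50, so the content counts as hot; in the second call neither rate exceeds its threshold.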
The problem with this dynamic allocation of additional caching servers to handle hot content is that each newly allocated caching server imposes additional load on the customer origin server. Each newly allocated caching server performs at least one request to and retrieval from the origin server in order to obtain its own copy of the hot content for redistribution. For example, if each of 20 PoPs dynamically allocates 10 caching servers to serve hot content of a particular customer, then 200 CDN caching servers may contemporaneously request the same content from the same customer origin. Such a spike in traffic to the customer origin can occur, as some examples, when streaming popular live, linear, or programmed events, when serving content related to a trending topic, program, or news item, or because of temporal spikes that occur on certain holidays or at certain times of the day.
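The arithmetic in the example above can be written out as a short sketch. It assumes each allocated caching server makes exactly one origin retrieval, which is the minimum the text describes; the function name `origin_fetches` and the single-designated-server baseline are illustrative assumptions.

```python
def origin_fetches(pop_count, servers_per_pop, fetches_per_server=1):
    """Lower bound on origin retrievals when every server fetches its own copy."""
    return pop_count * servers_per_pop * fetches_per_server


# Figures from the example in the text: 20 PoPs, 10 allocated servers each.
scaled_load = origin_fetches(20, 10)

# Assumed baseline: one designated caching server per PoP before scaling.
baseline_load = origin_fetches(20, 1)
```

Under these assumptions, scaling from one to 10 servers per PoP raises the minimum number of contemporaneous origin retrievals from 20 to 200, a tenfold increase in the load the CDN itself places on the customer origin.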
For sufficiently hot content, the demand from the dynamically allocated caching servers from different CDN PoPs can overwhelm the customer origin. This problem is exacerbated when the customer content is regularly updated and can only be cached for a short period of time, or contains dynamic elements that require retrieval from the customer origin for each request, for each new session, or for each new user. The dynamic hot content scaling performed by the CDN can effectively undo the request shielding that the CDN is supposed to provide to the customer origin.
There is therefore a need to shield the customer origin even when customer content becomes hot and the CDN or other distributed platform allocates additional servers or resources to meet end user demand. In particular, there is a need to enable the dynamically allocated servers to efficiently pre-fetch or retrieve in real-time the hot content that originates from the customer origin servers while simultaneously reducing or eliminating the load the dynamically allocated servers impose on the customer origin servers. In other words, there is a need for the CDN to dynamically scale to allocate additional resources to satisfy high demand for hot content while maintaining a shield protecting the customer origin from excess load.