Efficiency and performance are differentiating factors for content delivery networks (CDNs) or other distributed platforms that operate different points-of-presence (PoPs) with each PoP hosting a different set of servers at a different network location or geographic region. One area where distributed platform performance can be greatly impacted is in the performance of tasks that are distributed across multiple distributed platform servers for execution. Completion of such tasks is dependent on the weakest link of the distributed platform.
Distribution and execution of a purge command within a distributed platform illustrates the performance degradation that can result from just one weak link in the distributed platform. To purge content across the distributed platform, a distributed platform administrative server sends a purge command to the distributed platform content delivery servers that are deployed to the different geographic regions. The purge command instructs those content delivery servers to delete or otherwise remove certain content from storage or cache. The purge command is complete once each of the instructed content delivery servers deletes the specified content and reports task completion to the administrative server.
Should one of the many servers performing the purge not receive the command, be unable to complete the command because of a software or hardware failure, or have a problem reporting completion of the command back to the administrative server, the administrative server cannot deem the command as completed. The administrative server will then have to reissue the purge command or report a failure. Thus, a single point of failure within the distributed platform can degrade command execution performance for the entire distributed platform. This can create a trickledown effect that further impacts the distributed platform performance. For instance, in the event old content cannot be completely purged from the distributed platform, the distributed platform may continue to serve obsolete content to certain users or be unable to free storage, delaying or otherwise preventing updated customer content from being served from the distributed platform.
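The completion rule described above can be illustrated with a minimal sketch. It is not taken from any particular platform. The function names `run_purge` and `make_flaky_sender`, the `send_purge` callback, the `max_attempts` parameter, and the choice to reissue the command only to servers that have not yet confirmed are all assumptions made for illustration.

```python
from typing import Callable, Dict, Iterable, Set, Tuple


def run_purge(
    servers: Iterable[str],
    send_purge: Callable[[str], bool],
    max_attempts: int = 2,
) -> Tuple[bool, Set[str]]:
    """Issue a purge to every server and report whether all of them confirmed.

    send_purge(server) is a hypothetical transport callback. It returns True
    only when the server both deleted the content and reported completion.
    """
    pending = set(servers)
    for _ in range(max_attempts):
        # sorted() copies the set, so discarding inside the loop is safe.
        for server in sorted(pending):
            if send_purge(server):
                pending.discard(server)
        if not pending:
            # Every instructed server confirmed, so the command is complete.
            return True, set()
    # At least one server never confirmed, so a failure is reported.
    return False, pending


def make_flaky_sender(failures_before_success: Dict[str, int]) -> Callable[[str], bool]:
    """Build a fake sender: each listed server fails N times, then succeeds."""
    remaining = dict(failures_before_success)

    def send(server: str) -> bool:
        if remaining.get(server, 0) > 0:
            remaining[server] -= 1
            return False
        return True

    return send


if __name__ == "__main__":
    servers = ["pop-a-1", "pop-b-1", "pop-c-1"]  # hypothetical server names
    ok, unconfirmed = run_purge(servers, make_flaky_sender({"pop-c-1": 5}))
    print(ok, sorted(unconfirmed))
```

In this sketch a single server that never confirms keeps the whole command from completing, regardless of how many other servers succeeded, which mirrors the weakest-link behavior described above.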
Purge execution and command execution, in general, become more difficult as the distributed platform scales and deploys more servers to more PoPs, especially as the servers and PoPs are located in more distant and remote geographic regions. Such scaling introduces more execution points, each of which can become an additional point of failure or can increase delay in command execution completion. Scaling also increases the number of network hops and different transits or paths that the command signaling crosses in order to reach the servers. The network hops and transits themselves can experience different performance and failures. Such failures also slow the distributed platform's ability to execute distributed commands that implicate servers in different regions. The term “path” includes any arbitrary set of network routers or hops that are under control of a common transit provider through which a source operating within a first network can reach a destination operating within a second network. When different packets are sent from the source to the destination over a transit provider path, the packets can traverse different sets of routers or hops that are under control of the transit provider. The term “transit” refers to a specific path or a specific set of routers or hops under control of a common transit provider through which the source operating within the first network can reach the destination operating within the second network. In other words, when different packets are sent from the source to the destination over a specific transit, the packets traverse the same set of routers or hops. These terms will be used interchangeably in the following disclosure.
There is therefore a need to accelerate or improve distributed platform execution of distributed commands. Such acceleration or improvement can be obtained by reducing or resolving one or more of the variables that can degrade distributed platform performance, and specifically, the ability of the distributed platform to execute distributed commands across different servers operating in different regions. To this end, there is a need to improve the propagation of distributed commands across the distributed platform and reduce or resolve the potential for delay or failure that may occur if one or more paths carrying the command messaging between the distributed platform administrative server and PoPs become unavailable or underperform.