In a managed network such as a network owned by a corporation, distributing software releases (e.g., programs, patches, and other rollouts) has become a routine and relatively straightforward procedure. This is generally because the machines are in a highly-managed environment, owned by the company, and can be divided into logical groups (e.g., Administration, Accounting, Engineering and so forth). A managed network thus enables an enterprise to manage a software release according to network load and other priorities. Further, in order to cap network utilization, alternatives such as multicast are often available.
However, when distributing content over the internet to millions of anonymous clients, the task of distribution becomes substantially more difficult. One reason is that the client systems are independently owned and managed, so privacy and control issues prevent traditional management-style control for distribution. Further, multicast-style solutions are generally not available on the internet. As a result, only connection-oriented protocols (e.g., TCP/IP) can be considered, despite the scalability issues of such protocols.
In such large-scale content distributions, providing new content such as a software update for downloading by client systems tends to cause huge spikes in network activity. For example, when distributing a software patch over the internet, millions of clients can try to download the patch over a very short time period, overwhelming the data center's capacity and/or saturating network taps. To avoid rejecting too many client requests, since each rejection provides an undesirable user experience, the distribution capacity of the download facilities needs to be large enough to handle such spikes. Further, to continue to satisfy clients, this capacity needs to be continually increased over time, as more and more computer systems are becoming connected to the internet. This becomes very costly to the content provider.
The problems associated with large-scale software distributions can become more severe due to other factors. For example, the spikes in network activity are more severe when a network problem occurs, a download is larger than average, a second release closely follows a previous release, and/or an operational failure reduces capacity, causing incremental loading. Consider a network problem such as a loss of part of the internet backbone or related routers/DNS servers. Such an event can increase loading on alternate network paths, and may shift loading from one region of the internet to another. For example, a problem in the eastern United States can force servers in the western United States to take the additional load as client systems are redirected to those alternate servers.
A large patch, or a rollup of several patches that is larger than normal, will also increase server loading, because each individual client will remain connected for longer than the average time. As a result, the total number of simultaneous connections and the total bytes to be served to clients will increase correspondingly, straining resources and possibly causing requests to be rejected. A similar problem, with generally the same effect on the network, occurs when one release closely follows a previous release, because client systems compete for both updates and remain connected longer in order to pick up both packages.
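The relationship between package size and simultaneous connections can be illustrated with a minimal sketch based on Little's law (average concurrent connections equal the request arrival rate multiplied by the mean connection time). The function name, the arrival rate, the per-client transfer speed and the package sizes below are all hypothetical values chosen only for illustration; they do not come from any measured distribution.

```python
def concurrent_connections(requests_per_second: float,
                           package_bytes: float,
                           client_bytes_per_second: float) -> float:
    """Estimate average simultaneous connections via Little's law.

    Assumes every client downloads at the same steady speed, so the
    mean connection time is simply package size / transfer speed.
    """
    transfer_seconds = package_bytes / client_bytes_per_second
    return requests_per_second * transfer_seconds


# Hypothetical figures, for illustration only.
RATE = 1_000.0             # client requests arriving per second
CLIENT_SPEED = 500_000.0   # bytes per second for each client

normal = concurrent_connections(RATE, 5_000_000, CLIENT_SPEED)    # 5 MB patch
rollup = concurrent_connections(RATE, 20_000_000, CLIENT_SPEED)   # 20 MB rollup

print(f"normal patch: {normal:.0f} simultaneous connections")
print(f"rollup:       {rollup:.0f} simultaneous connections")
```

Under these assumed figures, a package four times larger holds four times as many connections open at once for the same request rate, which is the corresponding increase in server loading described above.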
An operational failure at the data center may reduce the ability of server farms to deliver updates. For example, loss of a network tap, router failure, DNS configuration problems or server hardware failures can all lead to problems providing the software to requesting clients.
In sum, content distribution over the internet is a complex, costly and unpredictable task. What is needed is a way for clients to obtain content, such as software updates, that does not overload the existing infrastructure and that does not significantly increase, and in some cases even reduces, the expense to the content provider.