Field of Invention
Embodiments of the present invention generally relate to techniques for assessing the resiliency of a distributed computing service provided by a collection of interacting servers.
Description of Related Art
A broad variety of computing applications have been made available to users over computer networks. Frequently, a networked application may be provided using multiple interacting computing servers. For example, a web site may be provided using a web server (running on one computing system) configured to receive requests from users for web pages. The requests can be passed to an application server (running on another computing system), which in turn processes the requests and generates responses passed back to the web server, and ultimately to the users.
Another example includes a content distribution system used to provide access to media titles over a network. Typically, a content distribution system may include access servers, content servers, etc., to which clients connect using a content player, such as a gaming console, computing system, computing tablet, mobile telephone, network-aware DVD player, etc. The content server stores files (or “streams”) available for download from the content server to the content player. Each stream may provide a digital version of a movie, a television program, a sporting event, user generated content, a staged or live event captured by recorded video, etc. Users access the service by connecting to a web server, where a list of content is available. Once a request for a particular title is received, the title may be streamed to the client system over a connection to an available content server.
The software applications running on systems such as these are often updated as ongoing development results in patches to fix vulnerabilities or errors as well as upgrades to make new features available. At the same time, the servers in a networked application may depend on one another in unforeseen or unintended ways, and changes to one system may result in an unintended dependency on another. When this happens, if a server fails, then access to the networked application can be disrupted.
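By way of illustration only, the effect of such a dependency can be sketched as follows. The sketch assumes a hypothetical dependency map in which the web server calls the application server and the application server calls a content server; the server names, the map, and the function disrupted_by are illustrative assumptions and are not part of any particular system described above.

```python
from typing import Dict, List, Set

# Hypothetical dependency map (an assumption for illustration):
# each server lists the servers it calls directly.
DEPENDS_ON: Dict[str, List[str]] = {
    "web_server": ["application_server"],
    "application_server": ["content_server"],
    "content_server": [],
}


def disrupted_by(failed: str, depends_on: Dict[str, List[str]]) -> Set[str]:
    """Return every server that directly or transitively depends on `failed`."""
    disrupted: Set[str] = set()
    changed = True
    # Repeat until no further server is found to rely on a failed or
    # already disrupted server.
    while changed:
        changed = False
        for server, deps in depends_on.items():
            if server == failed or server in disrupted:
                continue
            if any(dep == failed or dep in disrupted for dep in deps):
                disrupted.add(server)
                changed = True
    return disrupted


if __name__ == "__main__":
    # A failure of the content server reaches the servers in front of it.
    print(sorted(disrupted_by("content_server", DEPENDS_ON)))
```

In this sketch, a failure of the content server disrupts both the application server and the web server, whereas a failure of the web server disrupts no other server in the map. An unintended dependency corresponds to an entry in the map that the operators of the service did not know existed.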