Distributed computing systems managed by multiple and separate administrative units are becoming increasingly common due to developing technological trends, such as utility computing and service-oriented architectures. In these and other related scenarios, it can be difficult for any single entity to obtain information about the design of an entire system. For example, design information may be withheld from certain entities in order to maintain a competitive advantage, to preserve proprietary secrets, or to comply with legal requirements.
The inability of entities to obtain design information about the entire system can result in significant design consequences. For example, when designing highly available systems, it can be critical for redundant subsystems to have independent failure modes. That is, the failure of one subsystem should not cause the failure of a redundant subsystem. However, designing such systems can be difficult without access to deployment configurations and other information for each of the subsystems. In particular, the deployment configurations may indicate any shared dependencies (e.g., resources, network elements, etc.) that can be a common point of failure. For example, although a primary web service and a backup web service may appear to be independent, the primary web service and the backup web service may actually be hosted by the same virtual web hosting provider. As a result, any components shared by the primary web service and the backup web service become hidden shared dependencies and potentially common points of failure.
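As an illustration of why deployment configurations matter, the following minimal sketch assumes, hypothetically, that the configurations of both subsystems are fully visible and can be expressed as a dependency graph. The graph layout, the component names, and the functions `transitive_deps` and `shared_dependencies` are illustrative assumptions and are not drawn from any particular system.

```python
from typing import Dict, Iterable, Set


def transitive_deps(graph: Dict[str, Iterable[str]], root: str) -> Set[str]:
    """Collect every component that `root` depends on, directly or indirectly."""
    seen: Set[str] = set()
    stack = list(graph.get(root, ()))
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(graph.get(node, ()))
    return seen


def shared_dependencies(graph: Dict[str, Iterable[str]], a: str, b: str) -> Set[str]:
    """Components that both `a` and `b` rely on: candidate common points of failure."""
    return transitive_deps(graph, a) & transitive_deps(graph, b)


# Hypothetical deployment: two services that look independent at the top level.
deployment = {
    "primary": ["host-A"],
    "backup": ["host-B"],
    "host-A": ["provider-X"],
    "host-B": ["provider-X"],
    "provider-X": ["uplink-1"],
}

common = shared_dependencies(deployment, "primary", "backup")
```

In this hypothetical deployment, the two services have no direct dependency in common, yet both ultimately rely on `provider-X` and `uplink-1`. The computation is only possible because the whole graph is available to one party, which is the condition that often does not hold in practice.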
One conventional approach to identifying hidden shared dependencies is examining external observables of a system. External observables may be appropriate for identifying a limited set of shared dependencies. For example, when receiving a request for a document, a server may respond with the requested document as well as header information identifying the web server software (e.g., APACHE HTTP SERVER from APACHE SOFTWARE FOUNDATION or MICROSOFT IIS from MICROSOFT CORPORATION). In this case, the header information identifying the web server software is referred to as an external observable. By requesting documents from two different servers and comparing the associated header information, it can be determined whether the two servers run the same web server software. However, external observables are effective only to the extent of the information that the servers provide. In particular, external observables may not be effective for determining whether two servers share the same network connection or share the same backend database.
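The header comparison described above can be sketched as follows. This is a minimal illustration and not a definitive implementation: it assumes the observable is the standard HTTP `Server` response header, and the function names `server_product`, `fetch_headers`, and `same_server_software` are hypothetical. Only the product name is compared, so two different versions of the same software count as a match.

```python
from typing import Dict, Mapping, Optional
from urllib.request import Request, urlopen


def server_product(headers: Mapping[str, str]) -> Optional[str]:
    """Return the lowercased product name from a Server header, or None if absent."""
    for name, value in headers.items():
        if name.lower() == "server" and value.strip():
            # e.g. "Apache/2.4.57 (Unix)" -> "apache"
            return value.split()[0].split("/")[0].lower()
    return None


def fetch_headers(url: str, timeout: float = 5.0) -> Dict[str, str]:
    """Issue a HEAD request and return the response headers as a plain dict."""
    with urlopen(Request(url, method="HEAD"), timeout=timeout) as resp:
        return dict(resp.headers.items())


def same_server_software(
    headers_a: Mapping[str, str], headers_b: Mapping[str, str]
) -> Optional[bool]:
    """Compare two header sets; None means the observable is missing."""
    product_a = server_product(headers_a)
    product_b = server_product(headers_b)
    if product_a is None or product_b is None:
        return None  # nothing can be concluded without the header
    return product_a == product_b
```

For two live servers, the call would be `same_server_software(fetch_headers(url_a), fetch_headers(url_b))`, where `url_a` and `url_b` are placeholders. The `None` result reflects the limitation noted above: a server may omit or alter the header, and nothing in the headers reveals a shared network connection or backend database.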
Another conventional approach to identifying hidden shared dependencies is deduction based on known failure occurrences. By gathering a significant amount of information regarding the failures of multiple entities, deductions can be made by examining common behavior patterns between the entities. For example, if a first subsystem commonly fails at the same time as a second subsystem, then a correlation between the failure of the first subsystem and the failure of the second subsystem can be deduced. However, it can take a substantial amount of time to collect sufficient information, assuming this information is even available, in order to accurately determine these correlations.
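One simple way to quantify such a correlation is sketched below. The text above does not specify a measure, so this example assumes a Jaccard overlap of the time windows in which each subsystem failed. The window width, the timestamps, and the function names `failure_windows` and `co_failure_score` are all hypothetical.

```python
from typing import Iterable, Set


def failure_windows(timestamps: Iterable[float], width: float) -> Set[int]:
    """Map failure timestamps (in seconds) to indices of fixed-width time windows."""
    return {int(t // width) for t in timestamps}


def co_failure_score(windows_a: Set[int], windows_b: Set[int]) -> float:
    """Jaccard overlap: shared failure windows divided by all failure windows."""
    union = windows_a | windows_b
    if not union:
        return 0.0  # no failures observed, so there is no evidence either way
    return len(windows_a & windows_b) / len(union)


# Hypothetical failure logs, grouped into 60-second windows.
first = failure_windows([10, 70, 130, 400], width=60)
second = failure_windows([15, 75, 300, 410], width=60)
score = co_failure_score(first, second)  # 3 shared windows out of 5 -> 0.6
```

A score near 1 suggests that the two subsystems tend to fail together, which may point to a shared dependency. With only a handful of observed failures the score is unreliable, which illustrates why this approach can require a long collection period.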