Computer systems are often designed with tens, hundreds, or even thousands of separate components in order to realize the benefits of modularity. However, such systems can have an equally large number of potential points of failure. As the number of components in a system increases, it becomes more important to map the relationships between them in order to be able to quickly determine the root cause of an anomaly. Accurate dependency maps and methods to pinpoint the most likely root cause of an anomaly are important because, in many cases, a significant portion of a company's mission involves the reliable operation of such systems. For example, electronic network retailers and content providers can derive a substantial portion of their revenue from the sales and advertising facilitated by their computer systems, and any downtime can have a negative effect on customer traffic.
In many cases, anomalies experienced by one system component can affect the actions of another system component. In such cases, the two system components are related, and the second system component depends upon the first system component to execute properly. One problem, among others, is that such dependency relationships between the many separate system components can be difficult to map. In large-scale modern systems that undergo regular maintenance and upgrades, a dependency map must be updated each time a system component is added to, modified in, or removed from the system. Without an accurate method to map the dependency relationships between the many system components, it can be difficult to determine the root cause of an anomaly experienced by one of the system components. Moreover, because a large number of system components can have an effect on the operation of said system component, there can be a correspondingly large number of possible root causes for any one anomaly.
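The dependency relationships described above can be illustrated with a minimal sketch. It assumes a dependency map stored as a plain dictionary; the component names and the functions `candidate_root_causes` and `remove_component` are hypothetical and serve only to show how the set of possible root causes grows with the number of direct and indirect dependencies, and why the map must be updated when a component is removed.

```python
from collections import deque

# Hypothetical dependency map: each component maps to the set of
# components it depends on directly. All names are illustrative.
DEPENDS_ON = {
    "storefront": {"catalog", "checkout"},
    "checkout": {"payments", "inventory"},
    "catalog": {"inventory"},
    "payments": set(),
    "inventory": {"database"},
    "database": set(),
}


def candidate_root_causes(depends_on, component):
    """Return every component that `component` depends on, directly or indirectly.

    An anomaly in any returned component could be a root cause of an
    anomaly observed in `component`.
    """
    seen = set()
    queue = deque(depends_on.get(component, ()))
    while queue:  # breadth-first walk over the dependency edges
        current = queue.popleft()
        if current in seen:
            continue  # already visited; also guards against cycles
        seen.add(current)
        queue.extend(depends_on.get(current, ()))
    seen.discard(component)  # exclude the component itself if a cycle leads back to it
    return seen


def remove_component(depends_on, component):
    """Return a new map without `component` and without any edges pointing to it."""
    return {
        name: deps - {component}
        for name, deps in depends_on.items()
        if name != component
    }


if __name__ == "__main__":
    print(sorted(candidate_root_causes(DEPENDS_ON, "storefront")))
    updated = remove_component(DEPENDS_ON, "inventory")
    print(sorted(candidate_root_causes(updated, "storefront")))
```

Even in this six-component example, an anomaly in the hypothetical `storefront` component has five possible upstream root causes, and removing a single component changes the candidate set for several others.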
Some system administrators utilize dependency maps that require them to specify the relationships between system components. Other system administrators utilize monitoring systems that require them to specify the various ways in which an anomaly in one system component can be a root cause of an anomaly in another system component. System administrators who configure these monitoring systems may determine the settings based on information from system developers who may not have a complete picture of how the various system components interact, on their own anecdotal evidence regarding which anomalies of system components may have the most substantial effect on related system components, or on recommendations from other system administrators whose systems may be operating in an entirely different environment.