Electronic computing has evolved from primitive, vacuum-tube-based computer systems, initially developed during the 1940s, to modern electronic computing systems in which large numbers of multi-processor computer systems, such as server computers, workstations, and other individual computing systems, are networked together with large-capacity data-storage devices and other electronic devices to produce geographically distributed computing systems with hundreds of thousands, millions, or more components that provide enormous computational bandwidths and data-storage capacities. These large, distributed computing systems are made possible by advances in computer networking, distributed operating systems and applications, data-storage appliances, computer hardware, and software technologies.
In recent years, management tools have been developed to monitor the performance and capacity of the numerous and various components of distributed computing systems and to generate alerts that notify administrators of problems, significant changes to the infrastructure or applications, and other anomalous behavior of system components. However, these management tools can generate multiple alerts, which creates noise and leads to alert fatigue. In addition, the same alert may have different meanings in different environments. Currently, administrators speculate on how an alert may impact the health, risk, or efficiency of a distributed computing system based on previous experience with alerts generated by the management tools. As a result, evaluating the impact of problems on the distributed computing system and prioritizing responses to the various alerts is uncertain. Administrators seek methods for prioritizing the various types of alerts for optimal troubleshooting of the distributed computing infrastructure.