Computer hardware and software have, over the past 60 years, evolved at a spectacular rate. Early, room-size, vacuum-tube-based computer systems could muster only a tiny fraction of the processing bandwidth provided by one of the ubiquitous inexpensive microprocessors present in a wide variety of currently available electronic devices. Similarly, the data-storage subsystems and memories of early computers had capacities less than a tiny fraction of the data-storage and memory capacity of a modern smartphone. Early computers were stand-alone systems capable of running a single job, or executable, at a time. By contrast, enormous, complex virtualized data centers and cloud-computing facilities may include hundreds, thousands, or more server computers linked together with high-speed digital communications infrastructure and accessing enormous dedicated data-storage facilities to simultaneously execute myriad programs. Virtual data centers and cloud-computing facilities may be geographically distributed and generally provide Internet-based interfaces that allow remote users to configure and launch complex constellations of application programs distributed across large numbers of physical computer systems and accessed by thousands, tens of thousands, hundreds of thousands, or more concurrent users.
While the enormous sizes, complexities, and capacities of modern distributed computing systems provide great benefit to commercial and organizational users, the complexity and scale of these systems also present a variety of challenges. One frequently encountered challenge is the need to monitor the operational states of complex multi-tiered applications and to determine one or more causes of undesirable operational states. Many different types of monitoring facilities are generally embedded within complex computational systems. They may produce enormous volumes of log-file entries, digitally encoded metrics, and other information that characterizes the operational states of the many different systems, subsystems, components, and subcomponents of a virtual data center or cloud-computing facility that hosts one or more multi-tiered applications. While this large body of continuously generated information generally contains sufficient information to diagnose many different types of failures and undesirable operational states, it is a decidedly non-trivial task to process the large volumes of operational information to identify the generally small subset of the information relevant to diagnosis of causes underlying failures and undesirable operational states of a particular multi-tiered application.