Data centers can contain thousands of servers (both physical and virtual machines), with each server running one or more software applications. The servers and software applications generate log stream records to indicate their current states and operations. For example, software applications may output log records that sequentially list actions that have been performed and/or list application state information at various checkpoints or upon occurrences of defined events (e.g., faults).
The software applications are also referred to as software sources because they are sources of log stream records. Servers are one type of host that can execute software sources. Some data centers generate terabytes of log stream records every day from thousands of software sources running on thousands of hosts.
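As an illustration only, the relationship among hosts, software sources, and log stream records described above might be modeled as in the following minimal sketch. The `LogRecord` type, its field names, the `records_per_source` helper, and the sample values are all hypothetical and are not taken from any particular logging system.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class LogRecord:
    """One record in a log stream (field names are illustrative assumptions)."""
    host: str            # server (physical or virtual) that executed the source
    source: str          # software application that emitted the record
    timestamp: datetime  # when the record was emitted
    message: str         # action performed, checkpoint state, or event detail


def records_per_source(records):
    """Count the records emitted by each (host, source) pair."""
    return Counter((r.host, r.source) for r in records)


# Hypothetical sample data: two software sources running on two hosts.
sample = [
    LogRecord("host-01", "web-app", datetime(2024, 1, 1, 12, 0, 0), "request handled"),
    LogRecord("host-01", "web-app", datetime(2024, 1, 1, 12, 0, 1), "checkpoint: 42 active sessions"),
    LogRecord("host-02", "db", datetime(2024, 1, 1, 12, 0, 2), "fault: connection timeout"),
]
counts = records_per_source(sample)
```

Even this simple tally requires reading every record once, which suggests why processing terabytes of records per day from thousands of software sources can be costly.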
Significant processing resources and/or time may be required to determine correlations among the log stream records. Data center operations engineers (operators) may need to frequently determine such correlations in an iterative manner to analyze the root cause of problems. Because a human is in the loop, it can be important to determine the correlations in a fast and intuitive manner.