Data centers can contain thousands of servers (both physical and virtual machines), with each server running one or more software applications. The servers and software applications generate performance traces indicating their present states and operations. For example, software applications may output performance traces that sequentially list actions performed and application state information at various checkpoints or when triggered by defined occurrences (e.g., faults).
A performance trace can be a continuous set of ordered pairs <T, V>, where T is the timestamp when the value V is observed by a trace generator. The rate at which a trace is generated can be high for critical software/hardware components. Moreover, the cumulative amount of data generated by traces from a data center can be large.
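By way of a non-limiting illustration, the ordered-pair structure described above can be sketched in Python as follows. The names `TraceSample`, `generate_trace`, and `samples_per_second` are hypothetical and are introduced only for this example, and the fixed sampling period is an assumption; an actual trace generator may emit values at irregular intervals.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class TraceSample:
    """One ordered pair <T, V> of a performance trace (hypothetical type)."""
    timestamp: float  # T: time at which the value was observed
    value: float      # V: the observed value


def generate_trace(start: float, period: float,
                   values: Sequence[float]) -> List[TraceSample]:
    """Build a trace from values observed every `period` seconds (assumed fixed)."""
    return [TraceSample(start + i * period, v) for i, v in enumerate(values)]


def samples_per_second(trace: Sequence[TraceSample]) -> float:
    """Estimate the generation rate of a trace from its first and last timestamps."""
    if len(trace) < 2:
        return 0.0
    span = trace[-1].timestamp - trace[0].timestamp
    return (len(trace) - 1) / span if span > 0 else 0.0
```

Keeping each sample as a timestamped pair preserves the ordering of the trace, so a rate estimate such as the one above can be derived directly from the timestamps.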
Data from performance traces are stored in trace logs. Recent performance traces are usually kept temporarily in memory, while only the summaries and statistics of older performance traces are permanently saved. This results in two issues. First, the system needs sufficient network and processing bandwidth to generate the summaries and statistics, and sufficient storage capacity to handle the performance traces at peak rates and volumes. Second, fidelity of the information is lost when older performance traces are replaced by the summaries and statistics generated therefrom.
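The loss of fidelity described above can be illustrated with a minimal Python sketch, offered under stated assumptions and not as any particular system's implementation. The names `Summary` and `TraceLog`, the choice of count, total, minimum, and maximum as the retained statistics, and the fixed `recent_capacity` are all hypothetical.

```python
from collections import deque
from dataclasses import dataclass
from typing import Deque, Tuple


@dataclass
class Summary:
    """Statistics retained for older samples (assumed set of statistics)."""
    count: int = 0
    total: float = 0.0
    minimum: float = float("inf")
    maximum: float = float("-inf")

    def add(self, value: float) -> None:
        self.count += 1
        self.total += value
        self.minimum = min(self.minimum, value)
        self.maximum = max(self.maximum, value)

    @property
    def mean(self) -> float:
        return self.total / self.count if self.count else 0.0


class TraceLog:
    """Keeps the most recent samples in memory and summarizes older ones."""

    def __init__(self, recent_capacity: int) -> None:
        self.recent_capacity = recent_capacity
        self.recent: Deque[Tuple[float, float]] = deque()
        self.summary = Summary()

    def append(self, timestamp: float, value: float) -> None:
        self.recent.append((timestamp, value))
        # Samples beyond the capacity are folded into the summary; their
        # individual timestamps and values are discarded at this point.
        while len(self.recent) > self.recent_capacity:
            _, old_value = self.recent.popleft()
            self.summary.add(old_value)
```

In this sketch, once a sample leaves the in-memory window only its contribution to the aggregate statistics remains, so the original sequence of older <T, V> pairs cannot be reconstructed from the log.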