Distributed applications play an increasingly crucial role in business-critical enterprise operations. Consequently, they continue to grow in scale and complexity. Users and providers of computing systems, and of the distributed enterprise applications that run on them, value application-level performance because an unresponsive application may directly reduce revenue or productivity. Unfortunately, understanding application-level performance in complex modern distributed systems is difficult for several reasons. Today's commercial production applications are composed of numerous opaque software components running atop virtualized and poorly instrumented physical resources. Furthermore, the workloads that such applications impose on distributed systems are highly nonstationary, in the sense that the relative frequencies of transaction types in the workload vary considerably over time.
To make matters worse, applications are increasingly distributed in enterprise systems that span both geographical and organizational boundaries. An enterprise system often executes each application, and each sub-application therein, on a separate machine (e.g., a server) to isolate the effects of software faults and workload spikes. The machines themselves may span both geographical and organizational boundaries. Compared with such machine-granularity application boundaries, application consolidation (i.e., consolidating multiple applications that run on individual, often under-utilized, machines or systems onto a smaller number of more heavily used ones) has several advantages, including better resource utilization and lower management and maintenance overheads. However, merely collecting in one place enough performance data and knowledge of system design to support a detailed performance analysis, whether for a system evaluation or for an application consolidation, is often very difficult in practice. Also, rapidly changing application designs and configurations limit the useful lifespan of an analysis once it has been performed. In addition, workload fluctuations in consolidated environments may have complex effects on application-level performance that reduce the overall predictability of the system.
For the above reasons, operators and administrators seldom analyze running production systems except in response to measurements (or user complaints) indicating unacceptably poor performance or in response to a need or desire for application consolidation.