Today's information technology (IT) systems are large in scale and rich in features, which makes them complex to operate. Configuring application-level parameters (e.g., time-out limits and concurrent thread limits) and system-level parameters (e.g., resource allocation, I/O bandwidth, and cache size) for these complex IT systems is a challenging task for IT operators. Current management products typically set these parameters statically via offline analysis and expose interfaces that allow experienced operators to change the parameter values if needed. However, statically set values are rarely ideal across a range of workloads. In addition, it is expensive and error-prone for human operators to decide on appropriate values for these parameters in order to meet the performance metrics (e.g., transaction response time and throughput) or quality of service (QoS) targets of individual applications.
Because the workloads of Internet servers and enterprise applications fluctuate considerably, statically allocated resources are generally either underutilized or overloaded. To address this issue, IT systems and applications may expose interfaces that allow dynamic resource allocation or application configuration. However, it is often difficult to configure and adjust the relevant parameters properly, for a number of reasons. For instance, there are often tens or hundreds of tunable parameters (or “knobs”) that may be adjusted. Typically, only a subset of these parameters is critical to the application QoS for a given workload or under a certain operating condition. Identifying in real time which knobs to tune is a nontrivial task due to the number of knobs and the complex interrelationships between the various knobs and the performance metrics. Moreover, as the workload characteristics or system conditions vary over time, the key knobs that most affect the performance metrics may also change. In addition, the relationship between an application's performance metrics and its application-level and system-level configuration is complex. As a result, it is challenging both to identify and to adjust the most critical system-level or application-level parameters in response to changes in workloads or system conditions in order to meet application-level QoS targets.