1. Technical Field
The present invention relates to optimization of computer systems and more particularly to systems and methods using a decentralized probability active sampling approach for improving computer system performance.
2. Description of the Related Art
The performance of a computing system depends significantly on the choice of its various configuration parameters. An appropriate configuration setting can give the system the best quality of service (QoS), such as short response time, high throughput, and fairness among users. However, the growing scale, complexity, and heterogeneity of current computing systems create many challenges in determining an optimal configuration setting for system operation. For example, today's data centers commonly include thousands of physical machines to host a variety of web applications. It is difficult for human operators to find the best configuration setting for such large systems.
Currently, a commonly used approach for system configuration relies on the default settings that come with each system component from its vendor. Such default settings offer a conservative way to deploy the system, but they ignore the interdependencies among different system components. For example, the configuration of an application server in a web-based system depends heavily on the particular application being deployed and the type of back-end database with which it interacts. These system components usually come from different vendors. Non-optimal performance is therefore likely when these system components work together with their default configurations.
Therefore, a need exists to develop methods to automatically discover a configuration setting that can optimize the performance of a computing system in its entirety.
Due to the increasing complexity of computing systems, the automatic identification of a system's optimal configuration is important to large system optimization and management. Several approaches have been developed in recent years to deal with this problem. These approaches formulated the problem as an optimization problem and resorted to different algorithms to search for the best configuration. However, unlike many standard optimization techniques such as gradient-based algorithms, these algorithms must deal with an unknown, non-convex function with multiple local maxima.
A recursive random sampling (RRS) approach has been used to search the configuration space. It exploits the high initial efficiency of random sampling together with a mechanism that repeatedly restarts random sampling over an adjusted sampling space. A smart hill climbing (SHC) algorithm has also been proposed using the ideas of importance sampling and Latin Hypercube Sampling (LHS). This approach estimated a local function at each potential region and searched in the steepest descent direction of the estimated function.
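The Latin Hypercube Sampling idea mentioned above can be illustrated with a minimal sketch. It is not the SHC algorithm itself; it only shows how LHS spreads samples over a configuration space by dividing each parameter range into equal strata and using every stratum exactly once. The function name, the parameter ranges, and the seed are assumptions made for illustration.

```python
import random


def latin_hypercube(n_samples, bounds, rng):
    """Draw n_samples points from the box described by bounds.

    bounds is a list of (low, high) pairs, one per configuration parameter
    (hypothetical ranges). Each range is split into n_samples equal strata,
    and every stratum of every parameter is sampled exactly once.
    """
    columns = []
    for low, high in bounds:
        width = (high - low) / n_samples
        # One uniformly random value inside each stratum of this parameter.
        column = [low + (i + rng.random()) * width for i in range(n_samples)]
        # Shuffle so that strata are paired at random across parameters.
        rng.shuffle(column)
        columns.append(column)
    # Transpose the per-parameter columns into one tuple per sample.
    return [tuple(column[k] for column in columns) for k in range(n_samples)]


# Example: 5 samples over two hypothetical parameters.
samples = latin_hypercube(5, [(0.0, 10.0), (100.0, 200.0)], random.Random(0))
```

Compared with plain uniform sampling, this guarantees that no portion of any single parameter's range is left unsampled, which is useful when each sample requires an expensive performance measurement.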
In the Active Harmony project, a simplex-based direct search was utilized to optimize the unknown performance function with respect to configuration settings. This method forms a simplex in the parameter space from a number of samples, and iteratively updates the simplex through actions including reflection, expansion and contraction to guide the generation of new samples. However, the simplex-based search only works for a small number of configuration parameters, and easily becomes trapped in local optima.
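The reflection, expansion and contraction actions described above can be sketched as follows. This is a generic, simplified simplex search of the Nelder-Mead type, not the Active Harmony implementation. The cost function, the starting simplex, the step coefficients, and the shrink fallback are assumptions chosen for illustration, and the sketch minimizes a cost (for example, response time) rather than maximizing a performance measure.

```python
def simplex_search(cost, simplex, iterations=200):
    """Minimize cost starting from a simplex of n + 1 points in n dimensions."""
    simplex = [list(point) for point in simplex]
    n = len(simplex[0])
    for _ in range(iterations):
        simplex.sort(key=cost)
        best, worst = simplex[0], simplex[-1]
        # Centroid of every point except the worst one.
        centroid = [sum(p[d] for p in simplex[:-1]) / n for d in range(n)]

        def along(t):
            # Point on the line through the worst point and the centroid.
            return [centroid[d] + t * (centroid[d] - worst[d]) for d in range(n)]

        reflected = along(1.0)
        if cost(reflected) < cost(best):
            # Reflection found a new best point: try going further (expansion).
            expanded = along(2.0)
            simplex[-1] = expanded if cost(expanded) < cost(reflected) else reflected
        elif cost(reflected) < cost(simplex[-2]):
            # Reflection beats the second-worst point: accept it.
            simplex[-1] = reflected
        else:
            # Pull the worst point halfway toward the centroid (contraction).
            contracted = along(-0.5)
            if cost(contracted) < cost(worst):
                simplex[-1] = contracted
            else:
                # Fallback: shrink every point halfway toward the best one.
                simplex = [best] + [
                    [(b + x) / 2 for b, x in zip(best, p)] for p in simplex[1:]
                ]
    return min(simplex, key=cost)


# Example with a hypothetical smooth cost whose minimum is at (3, -1).
def example_cost(p):
    return (p[0] - 3.0) ** 2 + (p[1] + 1.0) ** 2


result = simplex_search(example_cost, [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)])
```

Because every move is driven only by comparisons among the current simplex points, the search has no means of leaving the basin in which it starts, which is consistent with the local-optima limitation noted above.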
For cases in which the number of parameters is large, another approach decomposed the configuration parameters into several small subsets by modeling the dependencies between different parameters. The simplex method was then conducted in each subset of parameters. Other approaches were also proposed.
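One plausible way to picture the decomposition step is sketched below. The cited approach's actual dependency model is not described here, so this sketch simply assumes that the dependencies are given as pairs of parameter names and groups parameters that are linked, directly or indirectly, into the same subset. All parameter names are hypothetical.

```python
def group_parameters(parameters, dependencies):
    """Split parameters into subsets so that dependent parameters stay together.

    dependencies is a list of (a, b) pairs meaning that a and b should be
    tuned jointly (an assumed input format). Parameters that are not linked
    by any chain of dependencies end up in different subsets.
    """
    parent = {p: p for p in parameters}

    def find(p):
        # Follow parent links to the representative of p's subset.
        while parent[p] != p:
            parent[p] = parent[parent[p]]  # path halving keeps chains short
            p = parent[p]
        return p

    for a, b in dependencies:
        parent[find(a)] = find(b)  # merge the two subsets

    groups = {}
    for p in parameters:
        groups.setdefault(find(p), []).append(p)
    return sorted(sorted(group) for group in groups.values())


# Example with hypothetical parameter names.
subsets = group_parameters(
    ["max_threads", "keep_alive", "db_pool", "db_cache", "heap"],
    [("max_threads", "keep_alive"), ("db_pool", "db_cache")],
)
```

Each resulting subset is small enough to be searched separately, for example with a simplex search, at the cost of ignoring any dependency that the model failed to capture.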