1. Field of the Invention
This invention relates generally to computer systems and software. More particularly, the invention relates to workload characterization for capacity planning and performance modeling.
2. Description of the Related Art
The data processing resources of business organizations are increasingly taking the form of a distributed computing environment in which data and processing are dispersed over a network comprising many interconnected, heterogeneous, geographically remote computers. Such a computing environment is commonly referred to as an enterprise computing environment, or simply an enterprise. Managers of the enterprise often employ software packages known as enterprise management systems to monitor, analyze, and manage the resources of the enterprise. Enterprise management systems may provide for the collection of measurements, or metrics, concerning the resources of individual systems. For example, an enterprise management system might include a software agent on an individual computer system for the monitoring of particular resources such as CPU usage or disk access. U.S. Pat. No. 5,655,081 discloses one example of an enterprise management system.
Workload characterization is a basic and important step in performance modeling and capacity planning. Traditional approaches to workload characterization often require sophisticated and time-consuming user assistance. With increasing amounts of software and hardware to manage, IT professionals and performance analysts have less time for detailed analysis of what is happening inside their applications and of the interactions between hardware and software. Nonetheless, they and their managers want to know how their applications, as characterized by workloads, are performing now and how they will perform in the future.
Traditionally, workload characterization involves many steps. First, one has to partition the applications on a system into meaningful activities (e.g., workloads). Second, one maps each process that has run on the system to one or more of these activities. Third, one uses this mapping and the metrics collected by the operating system over a fixed period of time to partition the activity on the system into workloads. Once the system resources consumed by each workload are known, it is possible to infer when potential performance problems might occur as workloads grow and/or other system conditions change.
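The three traditional steps described above can be illustrated with a minimal sketch. It is not drawn from any particular enterprise management system. The workload names, the process-name patterns, the sample records, and the metric fields (`cpu_seconds`, `disk_ios`) are all hypothetical. For simplicity, the sketch assigns each process to exactly one workload, although in general a process may map to more than one.

```python
from collections import defaultdict
import fnmatch

# Step 1 (assumed): workloads defined by the analyst as
# workload name -> process-name patterns. All names are hypothetical.
WORKLOAD_PATTERNS = {
    "database": ["oracle*", "sqlservr*"],
    "web": ["httpd*", "nginx*"],
}


def classify(process_name, patterns=WORKLOAD_PATTERNS):
    """Step 2: map a process to the first workload whose pattern matches."""
    for workload, globs in patterns.items():
        if any(fnmatch.fnmatchcase(process_name, g) for g in globs):
            return workload
    return "other"  # processes the analyst did not assign to any workload


def characterize(samples, patterns=WORKLOAD_PATTERNS):
    """Step 3: sum per-process metrics into per-workload totals."""
    totals = defaultdict(lambda: {"cpu_seconds": 0.0, "disk_ios": 0})
    for sample in samples:
        workload = classify(sample["process"], patterns)
        totals[workload]["cpu_seconds"] += sample["cpu_seconds"]
        totals[workload]["disk_ios"] += sample["disk_ios"]
    return dict(totals)


# Hypothetical per-process metrics collected over one fixed interval.
samples = [
    {"process": "oracle_pmon", "cpu_seconds": 12.5, "disk_ios": 300},
    {"process": "httpd", "cpu_seconds": 3.0, "disk_ios": 40},
    {"process": "oracle_dbw0", "cpu_seconds": 7.5, "disk_ios": 900},
    {"process": "cron", "cpu_seconds": 0.5, "disk_ios": 2},
]

result = characterize(samples)
for workload, usage in sorted(result.items()):
    print(workload, usage)
```

Even in this small form, the quality of the result depends entirely on the hand-written pattern table and on the per-process metrics being available and accurate.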
However, there are several problems with this approach. First, the need to find workloads and to determine which processes belong to which workload can be complicated and may take an inordinate amount of a user's time. Second, the information provided by the operating system is often incomplete and/or unreliable. For instance, with many operating systems, it is often very difficult or impossible to determine how many blocks of data a particular process wrote to disk, and it is rarer still that one can determine which disks that process used. Furthermore, faulty definitions of application workloads and of their resource consumption may lead to invalid predictions of future application performance and system bottlenecks.
Therefore, it is desirable to provide an improved system and method for workload characterization.