1. Field of the Invention
This invention relates to a method and apparatus for estimating a local performance index to measure the performance that is being achieved by work flowing through a particular server in a multi-system tier of an environment with multiple tiers. The invention may be utilized in the area of goal-oriented performance management of multi-tiered transaction-based work.
2. Description of the Related Art
Workload management is a computing concept whereby system resources, such as time on a central processing unit (CPU), are redistributed between jobs on a single server or redistributed between servers in accordance with how well certain performance metrics are being met. There are two general approaches in the area of workload management. Both of these approaches assume that the work being managed is running in an environment where the managed work is competing for some common resource. Examples of this include multiple managed pieces of work running on the same operating system instance or multiple operating system instances running on hardware that has been logically partitioned to allow multiple independent instances.
The first approach to workload management is a consumption-based approach. In this approach, a policy is created that describes the resource consumption constraints on particular pieces of work; these constraints are usually defined at an operating system or process boundary. An example of this would be specifying the amount of processor resource (i.e., CPU time) a database application will normally be allowed to consume. The multiple managed entities can then be ranked as to their relative importance to each other. Management can then occur via two different methods. The first is to move resources from “donor” instances that are underutilizing their allocated resources to “receiver” instances that have demand for resources beyond their allocated amount. The receivers may be prioritized by a defined relative importance. The second is to move resources from one instance to another, based on the defined relative importance, even when both are consuming their allocated resources.
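The first of these two methods can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from any particular product: the `Instance` record, the `rebalance` function, the convention that a lower `importance` number means more important work, and the simplification that a donor gives up all of its unused allocation.

```python
from dataclasses import dataclass


@dataclass
class Instance:
    """Hypothetical managed entity (e.g., an operating system instance)."""
    name: str
    allocated: float   # resource share currently allocated
    demand: float      # resource share the instance actually wants
    importance: int    # assumption: lower number = more important


def rebalance(instances):
    """Move unused allocation from donors to receivers.

    Donors are instances whose demand is below their allocation.
    Receivers are instances whose demand exceeds their allocation;
    they are served in order of relative importance.
    Returns the amount of spare resource left unassigned.
    """
    spare = 0.0
    for inst in instances:
        if inst.demand < inst.allocated:
            spare += inst.allocated - inst.demand
            inst.allocated = inst.demand

    receivers = sorted(
        (i for i in instances if i.demand > i.allocated),
        key=lambda i: i.importance,
    )
    for inst in receivers:
        grant = min(spare, inst.demand - inst.allocated)
        inst.allocated += grant
        spare -= grant
    return spare


# Hypothetical example: one donor and two receivers.
a = Instance("A", allocated=50.0, demand=20.0, importance=3)
b = Instance("B", allocated=30.0, demand=50.0, importance=1)
c = Instance("C", allocated=20.0, demand=40.0, importance=2)
leftover = rebalance([a, b, c])
```

In this example the more important receiver B is fully satisfied before C receives whatever remains, and the total allocation across the three instances is unchanged.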
The second approach to workload management, to which the present invention is directed, is a goal-oriented approach. Here, the policy states a goal such as average response time or percentile response time for a class of transactions and a relative importance for that work. Transaction-based work is the primary workload managed in this approach. Although resources are managed to attempt to meet the stated goals, this approach is different from the consumption-based approach described above. In this approach, the concept of a performance index is used. An example would be, given an average response time goal, that the performance index could be calculated by dividing the actual average response time of completed transactions by the goal. One commercial embodiment of this approach is the IBM Workload Manager (WLM) for z/OS. WLM for z/OS allows the management of single-hop transactions (a notion to be further described below) to a goal-based policy.
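The performance index example given above, for an average response time goal, can be sketched as follows. The function name `performance_index` and the sample values are assumptions made for illustration; the sketch shows only the division described above and is not the calculation used by any specific product.

```python
from statistics import mean


def performance_index(response_times, goal):
    """Actual average response time of completed transactions divided by the goal.

    response_times: response times of completed transactions (any consistent unit)
    goal: the average response time goal, in the same unit
    """
    if goal <= 0:
        raise ValueError("goal must be positive")
    if not response_times:
        raise ValueError("no completed transactions to measure")
    return mean(response_times) / goal


# Hypothetical sample: three completed transactions, times in milliseconds,
# measured against an average response time goal of 400 ms.
pi = performance_index([100, 300, 200], goal=400)
```

Under this definition a value of 1 means the goal is being met exactly, a value below 1 means the actual average response time is shorter than the goal, and a value above 1 means the goal is being missed.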
While WLM for z/OS effectively manages workloads for which it was originally designed, it is limited to managing transactions over only a single hop, meaning that the response time for a transaction is measured between two points within the same process or (to use the z/OS term) address space. The global response time for a particular type of transaction is then merely the sum of the response times for all of the processes that are supporting this type of transaction.
A number of previously issued patents describe this goal-oriented approach. U.S. Pat. No. 5,504,894 (Ferguson et al.), entitled “Workload manager for achieving transaction class response time goals in a multiprocessing system”, defines a performance index based on the complete response time of transactions. However, it does not deal with the problem of understanding the contribution of individual components of a multi-tiered application to overall (i.e., end-to-end) performance. U.S. Pat. No. 5,675,739 (Eilert et al.), entitled “Apparatus and method for managing a distributed data processing system workload according to a plurality of distinct processing goal types”, and U.S. Pat. No. 6,230,183 (Yocom et al.), entitled “Method and apparatus for controlling the number of servers in a multisystem cluster”, describe a “local performance index”. However, these two patents envision a group of systems, each of which completely processes a transaction, since each transaction is only a single hop. Thus the definition of “local performance index” is still based on the complete response time of the transaction, and it is local in the sense that it is based on the view of one system rather than the group of systems.
For a group of transactions running in a multi-tier, multi-system environment, it is relatively easy to determine the average response time for the collection of transactions by measuring the time between creation and completion for each transaction, summing these times, and dividing by the number of transactions. Estimating the impact on overall end-to-end performance of changing the resources allocated to the servers, or determining where the bottleneck lies, is more difficult. A view of the transactions from the point of view of a particular server in a particular tier is required.
U.S. Patent Application Publication 2005/0021736 (Carusi et al.), entitled “Method and system for monitoring performance of distributed applications”, describes the tagging of transactions using mechanisms provided by the Application Response Measurement (ARM) standard. Thus a particular transaction can be tracked through each hop that does work for it. The published application describes collecting response time data for each transaction at each hop; at some fixed interval, all of the transaction response time data is forwarded from the servers to a central point. Given the information collected, a complete view of each transaction can be constructed from the individual hop data for the transaction using the identification information provided by ARM. Therefore, given the knowledge of which specific machines the transaction flowed through at each hop, it is possible to assemble a collection of only those transactions that flowed through a particular server and, using those transactions' end-to-end response times, a local performance index can be calculated.
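The centralized calculation described in this paragraph can be sketched as follows. The record layout, the function name `local_performance_index`, and the server names are assumptions made for illustration and do not reflect the ARM interfaces or the data formats of the published application. The sketch also assumes an average response time goal, so that the index is the average end-to-end response time of the selected transactions divided by the goal.

```python
def local_performance_index(hop_records, end_to_end, server, goal):
    """Local performance index for one server, computed at a central point.

    hop_records: iterable of (transaction_id, server_name) pairs, one per hop
    end_to_end:  dict mapping transaction_id to end-to-end response time
    server:      the server whose view of the work is wanted
    goal:        average response time goal, same unit as end_to_end values

    Returns None if no completed transaction flowed through the server.
    """
    # Select only the transactions that flowed through this server.
    through = {txn for txn, srv in hop_records if srv == server}
    times = [end_to_end[txn] for txn in through if txn in end_to_end]
    if not times:
        return None
    # Average their end-to-end response times and compare with the goal.
    return (sum(times) / len(times)) / goal


# Hypothetical data: three transactions crossing a web tier and a database tier.
hops = [
    ("t1", "web1"), ("t1", "db1"),
    ("t2", "web2"), ("t2", "db1"),
    ("t3", "web1"), ("t3", "db2"),
]
totals = {"t1": 100, "t2": 300, "t3": 200}  # end-to-end times, milliseconds

pi_web1 = local_performance_index(hops, totals, "web1", goal=200)
pi_web2 = local_performance_index(hops, totals, "web2", goal=200)
```

Note that this sketch requires every hop record and every end-to-end response time to be available at the central point before any index can be computed.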
Because of the overhead of maintaining all of the transaction records at each server, forwarding all of this information to a central point, and processing all of the records, this approach implements a mechanism for selecting on the fly which transactions to instrument with ARM. It would be desirable, however, to be able to monitor all transactions efficiently without requiring transfer of the volume of data needed for this approach.