1. Field of the Invention
The present invention relates to the process of measuring the performance of computer systems. More specifically, the present invention relates to a method and an apparatus for dynamically characterizing computer system performance by varying multiple input variables simultaneously.
2. Related Art
Current methods for qualification testing of enterprise computing systems involve putting a maximum expected load on one or more input variables, and seeing if the system crashes. While this type of qualification testing is necessary, dynamical system characterization can provide a far greater wealth of information that is useful for designing robust systems that deliver optimal Quality-of-Service (QOS) performance over a large range of system parameters.
Dynamical system characterization involves introducing perturbations in one or more "input" variables, and measuring the time-dependent responses in one or more "response" variables. Typical input variables of interest can include those associated with load on the CPU, load on the memory, or I/O traffic to disk. Typical response variables of interest can include physical variables such as local temperatures, voltages, or currents; or a large range of what are called Quality-of-Service (QOS) variables. QOS variables can also include a variety of "wait times" that users experience when interacting with large Web centers. For example, QOS variables may include: the wait time after clicking on a shopping cart; the time to return a database query; or the time to process an FTP request to download a file.
Dynamical system characterization asks, "For a perturbation of x% in input variable A, what is the variation in response variable B?" We quantify this relationship with a "dynamic coupling coefficient" between A and B. This coupling coefficient, C, may be a function of load, or, more generally, may be a multivariate function of multiple input variables. There may be a linear relationship between input A and response B; or the relationship may be highly nonlinear.
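The question posed above can be illustrated with a minimal sketch. The text does not give a formula for the coupling coefficient, so the sketch assumes one simple reading: the fractional change in response B divided by the fractional change in input A. The function name and all numbers are hypothetical.

```python
def coupling_coefficient(a_base, a_perturbed, b_base, b_perturbed):
    """Assumed definition: fractional change in B per fractional change in A."""
    frac_a = (a_perturbed - a_base) / a_base
    frac_b = (b_perturbed - b_base) / b_base
    return frac_b / frac_a

# Hypothetical light-load case: a 5% rise in CPU load moves query time 200 ms -> 203 ms.
c_light = coupling_coefficient(40.0, 42.0, 200.0, 203.0)

# Hypothetical heavy-load case: the same 5% rise moves query time 200 ms -> 260 ms.
c_heavy = coupling_coefficient(90.0, 94.5, 200.0, 260.0)

print(f"light load C = {c_light:.2f}, heavy load C = {c_heavy:.2f}")
```

Under this assumed definition, a coefficient that grows with the operating point, as in the second case, is the kind of load-dependent behavior the paragraph describes.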
One can make an analogy with structural mechanics for airplanes. Static load testing may show that structural elements are strong enough to withstand all expected loads for a plane. However, resonances may exist wherein small vibrations are nonlinearly amplified, leading to catastrophic failures. Early airplane designers learned that it is crucially important to do dynamic response testing to learn if such structural resonances exist.
For computing systems, numerous phenomena have been identified that can lead to nonlinear coupling between small fluctuations in input variables and response variables. For example, if memory usage is near the limit of available memory, applications will swap to disk, a process that is much slower. For this reason, when stochastic load variations occur while ample memory remains available, coupling with QOS response variables is linear, has small coupling coefficients, and has extremely small phase-shifts (lag times) between input and response variables. However, when the same stochastic load variations occur near the limit of available memory, the coupling between input and response variables is strongly nonlinear and exhibits a measurable phase-shift.
Similar nonlinear coupling between input variables and response variables has been observed in online transaction processing (OLTP) systems with a phenomenon called latch contention. Moreover, in networked systems that are bandwidth constrained, packet collisions can lead to non-linearities in QOS latencies. Hence, there are many phenomena within computer systems that can lead to nonlinear dynamical interaction effects between input variables and a multitude of physical and QOS response variables.
Hence, what is needed is a method and an apparatus for dynamically characterizing computer system performance.
Although existing tools can measure and log these QOS variables as a function of time, there are no tools that permit the diagnosis of interaction effects between and among the various QOS variables. A system analyst may be interested in the answers to questions such as: (1) If I have 10,000 email users this month, and I add 2,000 new email accounts next month, how will the increased email traffic impact my database users' wait times? (2) What impact do 8,000 animated browser banners have on download latency times for my 3,000 active file transfer protocol (FTP) users, and vice versa? (3) What impact do 4,000 active portable document format (PDF) file viewers have on transaction processing system (TPS) throughput? In general, what impact does the instantaneous demand for performance variable X have on performance variable Y?
Attempting to measure the impact that one variable has on another variable is difficult, at best. One way of attempting to measure this interaction is to create a step increase in the number of transactions of one category and search for a change in the response time, or wait time, for transactions of another category. Detecting a change in the wait time for the second category of transaction requires the step increase in the first category to be relatively large. However, using a large step increase is undesirable because it may also adversely affect the response time for the first category of transaction. Additionally, if the step increase is too large, the increase can cause the system to fail, or "crash."
Hence, what is needed is a method and an apparatus to quantify the interaction of QOS variables.
Another problem is that the process of dynamically characterizing computer system performance can be extremely time-consuming. For example, a test involving a single input variable can require up to several hours. Consequently, a test involving multiple input variables can potentially take many days, which is an unacceptable amount of time in most testing contexts.
What is needed is a method and an apparatus for dynamically characterizing computer system performance through tests involving multiple input variables without requiring an undue amount of time.
One embodiment of the present invention provides a system that dynamically characterizes computer system performance. The system operates by simultaneously varying multiple input variables in the computer system, and gathering performance results by measuring time-dependent responses in one or more response variables. In this way, responses to variations in the multiple input variables can be measured simultaneously. Next, the system analyzes the performance results to determine correlations between input variables and output variables.
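As a rough illustration of why simultaneous variation can work, the sketch below perturbs two hypothetical inputs with sinusoids at different frequencies and separates their contributions to a single response by projecting the response onto each perturbation. The linear response model, the variable names, and the gains 0.8 and 2.5 are assumptions made only so the example can run; a real test would measure the response from the system.

```python
import math

def sinusoid(amplitude, cycles, n):
    """n samples of a sine wave completing `cycles` whole periods."""
    return [amplitude * math.sin(2 * math.pi * cycles * k / n) for k in range(n)]

def projected_gain(perturbation, response):
    """Least-squares gain of a zero-mean response on a zero-mean perturbation."""
    num = sum(p * r for p, r in zip(perturbation, response))
    den = sum(p * p for p in perturbation)
    return num / den

N = 600
cpu_load = sinusoid(5.0, 3, N)   # hypothetical input 1: 3 cycles per window
disk_io = sinusoid(2.0, 7, N)    # hypothetical input 2: 7 cycles per window

# Stand-in for a measured response (assumed linear mix, for illustration only).
latency = [100.0 + 0.8 * a + 2.5 * b for a, b in zip(cpu_load, disk_io)]
mean = sum(latency) / N
latency_ac = [v - mean for v in latency]

gain_cpu = projected_gain(cpu_load, latency_ac)
gain_disk = projected_gain(disk_io, latency_ac)
print(f"gain from CPU load: {gain_cpu:.3f}, gain from disk I/O: {gain_disk:.3f}")
```

Because the two sinusoids complete different whole numbers of cycles in the window, they are orthogonal, so one run yields both gains instead of requiring one run per input.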
In a variation on this embodiment, varying the multiple input variables involves using the concentric-hypersphere perturbation technique to generate very nearly ideal sinusoidal impulsional perturbations in the multiple input variables simultaneously.
In a variation on this embodiment, varying the multiple input variables involves varying a single input variable using a composite periodic disturbance comprising multiple distinct frequencies.
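A composite periodic disturbance of this kind might be generated as in the following sketch. The baseline, amplitudes, frequencies, and the function name are hypothetical choices, not values taken from the text.

```python
import math

def composite_disturbance(baseline, components, n_samples, sample_period):
    """Baseline plus a sum of sinusoids; components is a list of (amplitude, frequency_hz)."""
    signal = []
    for k in range(n_samples):
        t = k * sample_period
        offset = sum(a * math.sin(2 * math.pi * f * t) for a, f in components)
        signal.append(baseline + offset)
    return signal

# Hypothetical load target (percent) sampled once per second for 1000 s,
# combining three distinct frequencies in a single input variable.
load = composite_disturbance(50.0, [(4.0, 0.01), (2.0, 0.03), (1.0, 0.07)], 1000, 1.0)
print(min(load), max(load))
```

Choosing frequencies that each complete a whole number of cycles in the test window keeps the average load equal to the baseline.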
In a variation on this embodiment, varying the multiple input variables involves using multiple amplitudes for varying a single input variable.
In a variation on this embodiment, analyzing the performance results involves performing a normalized cross power spectral density (NCPSD) analysis on the performance results.
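The text does not define the normalization used in the NCPSD analysis. The sketch below assumes one plausible form: the cross power spectrum of a mean-removed input and response, divided by the geometric mean of their total spectral power. The test signals, including the 45-degree lag and the unrelated 20-cycle component, are hypothetical.

```python
import cmath
import math

def dft(x):
    """Direct discrete Fourier transform for bins 0 .. n/2 (O(n^2), fine for short records)."""
    n = len(x)
    return [sum(x[k] * cmath.exp(-2j * math.pi * m * k / n) for k in range(n))
            for m in range(n // 2 + 1)]

def ncpsd(x, y):
    """Cross power spectrum of x and y under the assumed total-power normalization."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    X = dft([v - mx for v in x])
    Y = dft([v - my for v in y])
    cross = [a.conjugate() * b for a, b in zip(X, Y)]
    norm = math.sqrt(sum(abs(a) ** 2 for a in X) * sum(abs(b) ** 2 for b in Y))
    return [c / norm for c in cross]

N = 128
# Hypothetical input perturbation: 8 cycles per window.
x = [math.sin(2 * math.pi * 8 * k / N) for k in range(N)]
# Hypothetical response: lagged, attenuated copy of x plus an unrelated 20-cycle term.
y = [0.5 * math.sin(2 * math.pi * 8 * k / N - math.pi / 4)
     + 0.3 * math.sin(2 * math.pi * 20 * k / N) for k in range(N)]

spectrum = ncpsd(x, y)
peak = max(range(len(spectrum)), key=lambda m: abs(spectrum[m]))
lag_phase = cmath.phase(spectrum[peak])
print(f"peak bin: {peak}, phase at peak: {lag_phase:.3f} rad")
```

In this sketch the magnitude peaks only at the frequency shared by input and response, and the phase at that peak reflects the lag between them, which is the kind of phase-shift information the background discussion mentions.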
In a variation on this embodiment, simultaneously varying the multiple input variables involves generating a pattern of synthetic transactions, and then sending the pattern of synthetic transactions to the computer system to be processed. (Note that "synthetic transactions" are transactions that are generated for the performance measurement purposes, and are not part of the computer system's normal workload.)
In a further variation, the pattern of synthetic transactions is sent to the computer system while the computer system is processing a normal workload, so that the pattern of synthetic transactions is added to the normal workload.
In a further variation, the pattern of synthetic transactions includes a varying pattern of synthetic transactions.
In a further variation, the pattern of transactions additionally includes a fixed pattern of synthetic transactions.
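A pattern that combines a fixed component with a varying component could be scheduled as in the sketch below. The one-second slots, the sinusoidal shape of the varying part, and all parameter values are assumptions for illustration; the code only computes how many synthetic transactions to issue per slot and does not send anything.

```python
import math

def synthetic_schedule(fixed_rate, amplitude, period_s, duration_s):
    """Transactions per one-second slot: a fixed part plus a sinusoidally varying part."""
    schedule = []
    for t in range(duration_s):
        # Varying part swings between 0 and 2 * amplitude, so it is never negative.
        varying = amplitude * (1 + math.sin(2 * math.pi * t / period_s))
        schedule.append(fixed_rate + round(varying))
    return schedule

# Hypothetical values: 20 fixed transactions per second plus 0 to 10 varying ones,
# with a 60-second period, over a 2-minute test.
schedule = synthetic_schedule(fixed_rate=20, amplitude=5, period_s=60, duration_s=120)
print(schedule[:10])
```

A driver program could issue `schedule[t]` synthetic transactions during second `t`, on top of whatever normal workload the system is already processing.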