Computing systems have become an integral part of business, government, and most other aspects of modern life. Most people are, regrettably, familiar with poorly performing computer systems. A poorly performing computer system may simply be poorly designed and, therefore, fundamentally incapable of performing well. Even well-designed systems will perform poorly, however, if adequate resources to meet the demands placed upon the system are not available. Properly matching the resources available to a system with the demand placed upon the system requires both accurate capacity planning and adequate system testing to predict the resources that will be necessary for the system to function properly at the expected loads.
Predicting the load that will be placed upon a system may involve a number of issues, and this prediction may be performed in a variety of ways. For example, future load on a system may be predicted using data describing the historical change in the demand for the system. Such data may be collected by monitoring a system or its predecessor, although such historical data may not always be available, particularly for an entirely new system. Other methods, such as incorporating planned marketing efforts or other future events known to be likely to occur, may also be used. The way in which system load is predicted is immaterial to the present invention.
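As an illustration of prediction from historical demand data, the following minimal sketch fits a least-squares trend line to past load measurements and extrapolates it. It shows only one of many possible approaches; the function names, the unit of load, and the sample history are hypothetical and are not taken from any particular system.

```python
def linear_trend(history):
    """Least-squares line through (period index, load) pairs.

    `history` is a list of load measurements, one per equally spaced
    period (hypothetical example: peak requests per hour, by month).
    Returns (slope, intercept).
    """
    n = len(history)
    if n < 2:
        raise ValueError("at least two historical measurements are required")
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(history) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, history))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def forecast_load(history, periods_ahead):
    """Extrapolate the fitted trend `periods_ahead` periods past the last sample."""
    slope, intercept = linear_trend(history)
    future_index = len(history) - 1 + periods_ahead
    return slope * future_index + intercept


# Hypothetical history: load grows by 100 units per period.
HISTORY = [1000, 1100, 1200, 1300]
predicted = forecast_load(HISTORY, periods_ahead=2)
```

Such an extrapolation assumes that past growth continues unchanged, which is why planned marketing efforts or other known future events may need to be incorporated separately.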
Regardless of how a prediction of future system load is made, a system must have adequate resources to meet that demand if the system is to perform properly. Determining the amount of resources required to meet a given system demand may also be a complex problem. Those skilled in the art will realize that system testing may be performed, often before a system is deployed, to determine how the system will perform under a variety of loads. System testing may allow system managers to identify the load at which system performance becomes unacceptable, which may coincide with a load at which system performance becomes highly nonlinear. One skilled in the art will also appreciate that such testing can be an enormously complex and expensive proposition, and will further realize that such testing often does not provide accurate information as to the load at which a system's performance will deteriorate. One reason for the expense and difficulty of testing is the large number of tests necessary to obtain a reasonably accurate model of system performance.
One skilled in the art will likely be familiar with the modeling of a system's performance as a linear function of load. One skilled in the art will further realize, however, that a linear model of system performance as a function of load is often a sufficiently accurate depiction of system performance only within a certain range of loads, and that the range of loads within which system performance is substantially linear varies from system to system. System performance often becomes nonlinear at some point as the load on the system increases. The point at which system performance becomes nonlinear may be referred to as the point at which the linear model breaks down. The load at which a system's performance begins to degrade in a nonlinear fashion may be referred to as the knee. At the knee, system throughput increases more slowly while response time increases more quickly. At this point system performance suffers severely, but identifying the knee in testing can be difficult. Accordingly, while a basic linear model theoretically can be obtained with as few as two data points, additional data points are necessary to determine when a linear model of system performance will break down. Obtaining sufficient data points to determine when a linear model of system performance breaks down often requires extensive testing. At the same time, such testing may not yield an accurate model of system performance, particularly as the system moves beyond a load range in which its performance is substantially linear.
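The distinction between fitting a linear model and locating the knee can be illustrated with a minimal sketch. It builds a line from the two lowest-load measurements and then reports the first higher load at which the measured response time exceeds the linear prediction by more than a chosen tolerance. The function names, the 25% tolerance, and the sample measurements are assumptions made for illustration, and this simple deviation test is only one possible way to flag the breakdown of a linear model.

```python
def fit_line(p0, p1):
    """Return (slope, intercept) of the line through two (load, response_time) points."""
    (x0, y0), (x1, y1) = p0, p1
    slope = (y1 - y0) / (x1 - x0)
    return slope, y0 - slope * x0


def find_knee(samples, tolerance=0.25):
    """Return the first load at which the linear model breaks down, or None.

    `samples` is a list of (load, response_time) measurements. The line is
    fitted to the two lowest-load samples; a later sample counts as a
    breakdown when its measured response time exceeds the linear prediction
    by more than `tolerance` (a fraction; 0.25 is an arbitrary assumption).
    """
    ordered = sorted(samples)
    if len(ordered) < 3:
        return None  # two points define the line but cannot reveal a knee
    slope, intercept = fit_line(ordered[0], ordered[1])
    for load, measured in ordered[2:]:
        predicted = slope * load + intercept
        if measured > predicted * (1 + tolerance):
            return load
    return None


# Hypothetical test results: (concurrent users, response time in seconds).
SAMPLES = [
    (100, 0.20), (200, 0.30), (300, 0.40),
    (400, 0.52), (500, 0.95), (600, 2.40),
]
knee = find_knee(SAMPLES)          # first load that departs from the line
no_knee = find_knee(SAMPLES[:4])   # tests that stop early never see the knee
```

With these hypothetical measurements, tests that stop at a load of 400 return no knee at all, which reflects why additional data points at higher loads are needed to determine where the linear model breaks down.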
The collection of system metrics in a production environment may be used to monitor system performance. System metrics collected in a production environment may also be used to model system performance. However, linear modeling of system performance using system metrics collected in a production environment is unlikely to yield a better prediction of the system's knee unless the system operates at or beyond that point. Of course, one skilled in the art will appreciate that the purpose of system testing and system modeling is to avoid system operation at and beyond the knee, meaning that if such data is available, the modeling and monitoring have already substantially failed.
A further challenge to using system metrics collected in a production environment is the burden of collecting the metrics. Simply put, collecting system metrics consumes resources. The system to be monitored, and/or associated systems operating with it, must measure, record, and process metrics. Particularly when a system is already facing a shortage of resources, the monitoring of the system's metrics must be performed efficiently and must provide significant benefit if its added cost is to be justified.