The present invention relates to a method and system implementing feedback algorithms for controlling a given simulation model. In particular, it relates to a computer-program-based method and system for providing a feedback control for a given set of control quantities of a simulation model, comprising a plurality of iterated simulation runs in which a single simulation run consumes a considerable amount of time.
The present invention is distinguished below from the prior art in the specific field of computer system modeling. In particular, the invention will be illustrated by an exemplary application to a large-scale multi-processor (MP) performance model with special focus on storage hierarchy (SH) performance.
In the context of the invention, the term model refers to a time-dependent simulation model, in contrast to an analytical spreadsheet model. Simulation models are true cycle-by-cycle models of computer systems; in particular, queuing delays and resource utilizations result from the simulated data transfer. Analytical models instead use mathematical formulas to calculate queuing delays from given rates and utilizations.
In large-scale compute environments, it is generally recommended that central processor (CP) utilization should not exceed 90%. Otherwise, excessive queuing delays for shared system resources can severely impact system performance. Therefore, performance benchmarks used to test real hardware are also usually performed at the same CP utilization of 90%.
Consequently, simulation models of real hardware must be able to accurately simulate any given CP utilization. When a multi-processor model ignores CP utilization and hence runs at a utilization of 100% instead of 90%, the load on the storage hierarchy cooperating closely with said CPs is significantly overstated and not representative of real customer environments. Sample model runs ignoring CP utilization turned out to mispredict total system capacity by as much as 7.5%. A misprediction of that magnitude can lead to wrong and hence costly design decisions and, even worse, to a wrong positioning in the marketplace.
The difficulty in correctly modeling a given CP utilization results from the fact that CP busy time, and hence CP utilization, includes both pure instruction processing time and the time in which the CP is waiting for storage hierarchy (SH) requests to be resolved. The wait time in turn depends on the load of the SH and thus indirectly on CP utilization. As this functional relationship is unknown, prior art computer simulation models often ignore CP utilization completely and operate constantly at a utilization of 100%, with the exposures and consequences mentioned above.
The inventive approach presented here applies both to computer system models driven by instruction traces and to those driven by event rates.
In models driven by instruction traces, the simulated CPs read, interpret and execute instruction sequences. In this case, CP utilization can be simulated by suspending instruction processing for a while at adequate points in the instruction stream, that is, by artificially inserting periods of user think time. Event-driven models instead implement drivers which statistically generate requests to the storage hierarchy. Typical requests are L1 cache misses. Their frequency is an input parameter to the model, and CP utilization is reflected in the interarrival time of events, which is the average number of processor cycles between two events: the higher the CP utilization, the more requests are issued per unit of time.
In both model types, the lever for manipulating a CP's utilization is its relative utilization during the time in which it is not waiting for an SH request to be resolved. This relative utilization is referred to herein as the CP's entry utilization χ. The total utilization μ is then a workload-dependent and system-dependent function of χ, μ=μ(χ). As mentioned above, μ includes CP wait times and hence is unknown, and the underlying utilization modeling problem consists in choosing χ such that μ=μ(χ) converges to the utilization aimed at. This utilization quantity is referred to herein as the target utilization u.
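As a rough illustration of how an entry utilization χ could be imposed in the two model types, the following sketch computes the think time to insert in a trace-driven model and the mean interarrival time to use in an event-driven model. The function names, the parameter names, and the simple proportional relation assumed for the event-driven case are hypothetical and serve only to make the arithmetic concrete; they are not taken from any particular simulator.

```python
def think_cycles(busy_cycles, chi):
    """Idle ("think") cycles to insert after busy_cycles of instruction
    processing so that the CP is busy a fraction chi of its non-waiting time.

    Follows from chi = busy / (busy + think).
    """
    if not 0.0 < chi <= 1.0:
        raise ValueError("entry utilization chi must be in (0, 1]")
    return busy_cycles * (1.0 - chi) / chi


def mean_interarrival_cycles(cycles_at_full_utilization, chi):
    """Mean number of non-waiting cycles between two generated SH requests.

    Assumption (illustrative only): the request rate scales linearly with
    chi, so the interarrival time scales with 1 / chi.
    """
    if not 0.0 < chi <= 1.0:
        raise ValueError("entry utilization chi must be in (0, 1]")
    return cycles_at_full_utilization / chi
```

Under these assumptions, an entry utilization of 0.9 with 900 busy cycles calls for about 100 inserted think cycles, and a driver that issues one request every 90 cycles at χ=1 would issue one about every 100 cycles at χ=0.9. Neither function says anything about the resulting total utilization μ, which still depends on the SH wait times.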
As mentioned before, prior art computer simulation models often ignore CP utilization completely. Where prior art does provide for setting a predetermined CP utilization, the most obvious approach to make μ converge to u is the so-called regulation technique.
Said prior art regulation techniques use the following iterative approach which consists of:
    a) defining a start value for the entry control quantity,
    b) performing a current run of simulation with this start value, thus yielding a resulting value for the target control quantity,
    c) comparing the resulting target quantity to its value aimed at,
    d) defining a new starting value in dependence of the existing result, and
    e) performing a next run of simulation with this new starting value.
Of course, instead of being 1-dimensional, the starting value as well as the control quantity can be a multi-dimensional vector reflecting the multiple dimensions in a simulation system.
In prior art, the algorithm to choose the entry utilization works as follows: Starting with some entry value χ=χ′1, the simulation model runs for a while, then determines the simulated total utilization u1 and chooses a new value for χ to be used in the next iteration of the simulation run: If u1<u, then χ′2>χ′1 is chosen, and vice versa. χ′ denotes the instantaneous value of χ valid for the next iteration. The quality and speed of convergence depend on the step sizes chosen for the difference between χ′2 and χ′1 and on the initial value chosen for χ, and may even vary with the value of the target utilization u. In particular, this heuristic method is prone to over-correcting χ and hence may lead to an undesired wide oscillation of the simulated utilization around u.
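The oscillation described above can be reproduced with a minimal sketch of such a regulation loop. Everything here is an assumption made for illustration: `simulate_total_utilization` is a hypothetical stand-in for one long simulation run, its quadratic form is a toy relation rather than a measured μ(χ), and the fixed step size is only one of many possible step heuristics.

```python
def simulate_total_utilization(chi):
    """Hypothetical stand-in for one simulation run; returns mu(chi).

    Toy assumption: SH wait time adds busy time that grows with load.
    """
    return min(1.0, chi + 0.3 * chi * chi)


def regulate(target, chi_start, step, runs):
    """Fixed-step regulation: raise chi if mu is below target, else lower it.

    Returns the list of (chi, mu) pairs, one per simulation run.
    """
    chi = chi_start
    history = []
    for _ in range(runs):
        mu = simulate_total_utilization(chi)
        history.append((chi, mu))
        if mu < target:
            chi = min(1.0, chi + step)
        elif mu > target:
            chi = max(0.0, chi - step)
    return history


if __name__ == "__main__":
    for chi, mu in regulate(target=0.9, chi_start=0.5, step=0.1, runs=12):
        print(f"chi={chi:.3f}  mu={mu:.3f}")
```

With these toy values the loop approaches the target within a few runs and then alternates between a utilization below 0.9 and one above it without settling, because the fixed step is larger than the remaining distance to the solution. A smaller step would narrow the oscillation at the cost of more simulation runs.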
Disadvantageously, prior art feedback control algorithms may even become complex as soon as the convergence history is used to determine the value of χ for the next iteration of the simulation run. Thus, the drawbacks of the prior art are that:
    a) it converges too slowly, which implies long-lasting simulation runs, and
    b) it is exposed to an undesired oscillation around the target utilization.