Field of the Invention
The present invention relates generally to software-based computer system simulators and, more particularly, to enabling faster simulation performance by analyzing the past behavior of simulated instructions to determine when it is necessary to execute code that is more costly in terms of performance.
The use of simulation systems for developing computer systems and software has been shown to provide tremendous advantages in terms of time savings and cost. More recently, the technique of full system simulation has gained more attention as advances in processor speeds of contemporary host computer platforms enable good simulation of detailed hardware models in computer system architectures. System simulation provides a virtual computer with the capability to create functional models of CPUs with supporting chips in sufficient detail to execute native object code without modification. Furthermore, simulation provides the significant advantage of enabling developers to develop and test their software applications and hardware products on high-end systems without having to use actual hardware, with its associated expense, or hardware that is still under development and therefore unavailable.
Simulation offers benefits for software development where it can arbitrarily parameterize, control, and inspect the computer system it is modeling and provide measurements that are non-intrusive and deterministic. It also provides a basis for automation where multiple simulator sessions can run in parallel and sessions can be fully scripted using performance monitors. For example, the software can be combined with the virtual systems in a way that facilitates modeling of the processor, disk access, memory access, cache configuration, and memory system bus and other parameters. With full system simulation, the operating system, device drivers, and application software cannot tell whether they are running on real hardware or in the simulated environment. Maximum flexibility is obtained using the simulated environment with the ability to change the parameters to suit a particular testing environment while performing comprehensive data collection and execution and performance tracking.
A critical issue relating to simulation systems is that of performance, i.e., a tradeoff must typically be made between simulating the system as accurately as possible and achieving a performance level that is acceptable. Creating a more realistic simulation workload environment generally comes at the expense of time and cost to deal with specification inaccuracies and implementation errors. One area that has a significant effect on performance relates to memory accesses. Memory accesses are performed very frequently, so any improvement in access times can significantly improve simulation performance. By way of example, when simulating the instruction set of a modern processor, simulation of memory accesses is a key component of simulator efficiency, since memory accesses can, e.g., make up somewhere between one fifth and one third of all operations. The simulated processor has a memory management unit (MMU), which checks that the addresses may be accessed and translates the virtual memory addresses to physical memory addresses. This check and translation is performed for every memory access. When done in hardware, it has an insignificant effect on performance, but when done in a simulator it greatly affects simulation performance.
Modern processors often employ page-based virtual memory translation. In some cases, other mechanisms such as segmentation are used in addition to paging. A page is a fixed-size, aligned address range with a specific translation from virtual to physical address. The page is also used to specify the access rights, i.e., whether the addresses can be read, written, or executed. Thus a page is the smallest address range that can have a specified virtual to physical address translation and a specified access protection. Some architectures allow so-called large pages, i.e., pages of different sizes where the larger page size is an integer multiple of the smaller page sizes.
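By way of illustration only, the page-granular translation and access check described above might be sketched as follows. The 4096-byte page size, the flag values, and the names `PageEntry`, `AccessViolation`, and `translate` are assumptions made for this example and do not correspond to any particular architecture or simulator.

```python
from dataclasses import dataclass

PAGE_SIZE = 4096              # assumed page size; real architectures vary
READ, WRITE, EXEC = 1, 2, 4   # hypothetical access-right bit flags


@dataclass(frozen=True)
class PageEntry:
    physical_base: int  # page-aligned physical address of the page
    rights: int         # bitwise OR of READ / WRITE / EXEC


class AccessViolation(Exception):
    """Raised when a page is unmapped or lacks the requested right."""


def translate(page_table, vaddr, access):
    """Translate a virtual address, checking rights at page granularity."""
    # Split the address into a virtual page number and an in-page offset.
    vpn, offset = divmod(vaddr, PAGE_SIZE)
    entry = page_table.get(vpn)
    if entry is None or not (entry.rights & access):
        raise AccessViolation(hex(vaddr))
    # Every address in the page shares one translation and one protection.
    return entry.physical_base + offset


# Virtual page 2 maps to physical address 0x8000, readable and writable only.
page_table = {2: PageEntry(physical_base=0x8000, rights=READ | WRITE)}
paddr = translate(page_table, 2 * PAGE_SIZE + 0x10, READ)
```

In this sketch a read of the mapped page succeeds, whereas an attempt to execute from the same page raises `AccessViolation`, because the protection is attached to the page as a whole rather than to individual addresses.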
Conventional simulators have either used the hardware support from the host or a pure software algorithm to perform the virtual to physical address translation. When doing it in software, a hash lookup is typically performed first. The hash lookup typically requires many simulated instructions, thereby affecting performance if done too frequently. If the lookup hits, which is the common fast case, the access is guaranteed not to violate any access restrictions and the lookup table returns the address where simulated memory is stored. If the lookup misses, a slow path is used that can handle all the uncommon cases that can happen to a memory access, such as a TLB miss, an access violation, and misalignment. A description of such a lookup algorithm can be found in REF1, i.e., the article “Efficient Memory Simulation in SimICS” by P. S. Magnusson and B. Werner, in Proceedings of the 28th Annual Simulation Symposium, 1995. However, even when the lookup hits, performing it for every memory access is costly in terms of performance.
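A minimal sketch of a software lookup of this general kind is given below for loads only. It is not the algorithm of REF1; the class `SimMemory`, the table size, the direct-mapped indexing by virtual page number, and the use of built-in exception types are all assumptions made for illustration.

```python
PAGE_SIZE = 4096   # assumed page size
TABLE_SIZE = 256   # assumed number of slots in the lookup table
READ, WRITE = 1, 2  # hypothetical access-right bit flags


class SimMemory:
    def __init__(self, mmu_pages):
        # mmu_pages: {virtual page number: (physical page number, rights)}
        self.mmu_pages = mmu_pages
        self.backing = {}                 # physical page number -> bytearray
        self.table = [None] * TABLE_SIZE  # slots hold (vpn, page storage)
        self.slow_path_calls = 0

    def _slow_path(self, vpn, offset, size):
        """Handle the uncommon cases, then refill the lookup table."""
        self.slow_path_calls += 1
        if offset % size != 0:
            raise ValueError("misaligned access")
        if vpn not in self.mmu_pages:
            raise LookupError("TLB miss: page not mapped")
        ppn, rights = self.mmu_pages[vpn]
        if not (rights & READ):
            raise PermissionError("access violation")
        page = self.backing.setdefault(ppn, bytearray(PAGE_SIZE))
        # Only pages that passed every check are entered, so a later hit
        # is guaranteed not to violate any access restriction.
        self.table[vpn % TABLE_SIZE] = (vpn, page)
        return page

    def load(self, vaddr, size=4):
        vpn, offset = divmod(vaddr, PAGE_SIZE)
        slot = self.table[vpn % TABLE_SIZE]
        if slot is not None and slot[0] == vpn and offset % size == 0:
            page = slot[1]  # fast path: the table returns the storage
        else:
            page = self._slow_path(vpn, offset, size)
        return int.from_bytes(page[offset:offset + size], "little")


# Virtual page 5 maps to physical page 9 and is readable.
mem = SimMemory({5: (9, READ)})
mem.backing[9] = bytearray(PAGE_SIZE)
mem.backing[9][8:12] = (0xDEADBEEF).to_bytes(4, "little")
first = mem.load(5 * PAGE_SIZE + 8)   # miss: slow path fills the table
second = mem.load(5 * PAGE_SIZE + 8)  # hit: fast path only
```

Even on the fast path, the sketch still computes a table index, compares a tag, and checks alignment for every load, which illustrates why a hit remains costly when the lookup is repeated for every memory access.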
A simulator running as a user process on a conventional OS cannot use hardware support to perform translations. To do this, the process must be given OS privileges or the OS must be modified to allow the simulator more control of the host hardware. Typically, the simulator wants to control the host translation lookaside buffer (TLB) and/or the segmentation hardware to support the simulated translations.
In view of the foregoing, it is desirable to provide a commercial quality level simulation platform that offers improved simulator performance in order to more accurately model workloads by running unmodified code in realistic configurations.