1. Technical Field
The present invention relates in general to performance monitoring and in particular to performance monitoring of virtual memory address translations. Still more particularly, the present invention relates to monitoring the performance of multi-hierarchical address translation in a processing system.
2. Description of the Related Art
In typical computer systems utilizing processors, system developers seek to optimize executing software for more effective system design. Usually, studies of a program's memory access patterns and its interaction with a system's memory hierarchy are performed to determine system efficiency. Understanding the behavior of the memory hierarchy aids in developing algorithms that schedule and/or partition tasks, as well as distribute and structure data, for optimizing the system.
Performance monitoring is often used in optimizing software in a system. A performance monitor is generally regarded as a facility incorporated into a processor to monitor selected characteristics to assist in the debugging and analyzing of systems by determining a machine's state at a particular point in time. Often, the performance monitor produces information relating to the utilization of a processor's instruction execution and storage control. For example, the performance monitor can be utilized to provide information regarding the amount of time that has passed between events in a processing system. The information produced usually guides system architects toward ways of enhancing performance of a given system or of developing improvements in the design of a new system.
Current approaches to performance monitoring include the utilization of test instruments. Unfortunately, this approach is not completely satisfactory. Test instruments can be attached to the external processor interface, but such instruments cannot determine the nature of the internal operations of a processor and cannot distinguish between instructions executing in the processor. Test instruments designed to probe the internal components of a processor are typically considered prohibitively expensive because of the difficulty associated with monitoring the many busses and probe points of complex processor systems that employ out-of-order execution, multiple pipelines, branch pre-detection, instruction prefetching, data buffering, and more than one level of memory hierarchy within the processors. A common alternative for providing performance data is to change or instrument the software. In most processing systems, however, modifying the software significantly affects the path of execution and may invalidate any results collected. Consequently, software-accessible counters are incorporated into processors. Most software-accessible counters, however, are limited in the granularity of the information they provide.
Further, a conventional performance monitor is usually unable to capture machine state data until an interrupt is signaled. Consequently, results may be biased toward the machine conditions that are present when the processor allows interrupts to be serviced. Also, interrupt handlers may cancel some instruction execution in a processing system where, typically, several instructions are in progress at one time. Further, many interdependencies exist in a processing system, so that in order to obtain any meaningful data, a profile of the state of the processing system must be obtained at the same time across all system elements. Accordingly, control of the sample rate is important because this control allows the processing system to capture the appropriate state. It is also important that the effect of the previous sample on the sample being monitored be negligible, so that the performance monitor does not affect the performance of the processor. Accordingly, a need exists for a system and method for effectively monitoring processing system performance that will efficiently and noninvasively identify potential areas for improvement.
In particular, in systems supporting virtual memory storage, a need exists for a method of monitoring the performance of effective-to-real address translations. Address translation performance information could be utilized to identify processing bottlenecks, to determine if processor resources are sufficient to support operation of a particular software program, and to determine what modifications to a software program's operation could improve efficiency. Such information may also be utilized during design of future processors to improve performance.