1. Field of the Present Invention
The present invention generally relates to the field of data processing systems and more particularly to an application for monitoring and graphically displaying memory transactions in a distributed memory system.
2. History of Related Art
The use of multiple processors to improve the performance of a computer system is well known. In a typical multi-processor arrangement, a plurality of processors are coupled to a system memory via a common bus referred to herein as the system or local bus. The use of a single bus ultimately limits the ability to improve performance by adding processors because, after a certain point, the limiting factor in the performance of a multiprocessor system is the bandwidth of the system bus. The system bus bandwidth is typically saturated after a relatively small number of processors have been attached to the bus. Incorporating additional processors beyond this number generally results in little if any performance improvement.
Distributed memory systems have been proposed and implemented to combat the bandwidth limitations of single bus systems. In a distributed memory system, two or more single bus systems referred to as nodes are connected to form a larger system. Each node typically includes its own local memory. One example of a distributed memory system is referred to as a non-uniform memory architecture (NUMA) system. A NUMA system is comprised of multiple nodes, each of which may include its own processors, local memory, and corresponding system bus. The memory of each node is accessible to each other node via a high speed interconnect network that links the various nodes. The use of multiple system busses (one for each node) enables NUMA systems to employ additional processors without incurring the system bus bandwidth limitation experienced by single bus systems. Thus, NUMA systems are more suitably adapted for scaling than conventional systems.
In a NUMA system, the time required to access system memory is a function of the memory address because accessing memory local to a node is faster than accessing memory residing on a remote node. In contrast, access time is essentially independent of the memory address in conventional SMP designs. Software optimized for use on conventional machines may perform inefficiently on a NUMA system if the software generates a large percentage of remote memory accesses when executed on the NUMA system. The potential for performance improvement offered by scalable NUMA systems may be partially offset or entirely negated if, for example, the paging scheme employed by the NUMA system allocates a code segment of the software to the physical memory of one node and a data segment that is frequently accessed by the processors of another node. Due to variations in memory architecture implementation, paging mechanisms, caching policies, program behavior, etc., tuning or optimizing of any given NUMA system is most efficiently achieved with empirically gathered memory transaction data. Accordingly, mechanisms designed to monitor memory transactions in NUMA systems are of considerable interest to the designers and manufacturers of such systems. Hardware mechanisms suitable for gathering memory transaction information in a NUMA system are disclosed in the above-referenced patent applications. To take full advantage of the information the monitoring hardware is capable of gathering, it is desirable to implement an elegant and powerful user interface that enables the user to capture, display, and analyze information provided by memory transaction monitoring hardware.
The problem identified above is addressed by a system for and method of monitoring memory transactions in a data processing system. The method includes defining a set of memory transaction attributes with a monitoring system and detecting, on a data processing system connected to the monitoring system, memory transactions that match the defined set of memory transaction attributes. The number of detected memory transactions occurring during a specified duration is then displayed in a graphical format. In one embodiment, the data processing system comprises a non-uniform memory architecture (NUMA) system comprising a set of nodes. In this embodiment, the detected transactions comprise transactions passing through a switch connecting the nodes of the NUMA system. The set of memory transaction attributes may include memory transaction type information, node information, and transaction direction information. The data processing system may operate under a first operating system such as a Unix® based system while the monitoring system operates under a second operating system such as a Windows® operating system. The set of memory transaction attributes may include memory address information. In this embodiment, defining the memory address information may include defining a memory window size, subdividing the memory window into a set of memory grains, and displaying the number of detected memory transactions corresponding to each memory grain in the memory window.
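The subdivision of a memory window into memory grains can be illustrated with a minimal sketch. The function name, its parameters, and the assumption that the window size is a whole multiple of the grain size are all hypothetical and chosen for illustration only; the sketch takes a list of already detected transaction addresses as input and does not model the monitoring hardware or the graphical display.

```python
def grain_counts(addresses, window_base, window_size, grain_size):
    """Count detected transactions per memory grain.

    The window covers [window_base, window_base + window_size) and is
    divided into equal grains of grain_size bytes. All names here are
    illustrative assumptions, not taken from the disclosure.
    """
    if grain_size <= 0 or window_size <= 0 or window_size % grain_size != 0:
        raise ValueError("window_size must be a positive multiple of grain_size")

    counts = [0] * (window_size // grain_size)
    for addr in addresses:
        offset = addr - window_base
        # Addresses outside the window are ignored.
        if 0 <= offset < window_size:
            counts[offset // grain_size] += 1
    return counts


# Example: a 0x400-byte window starting at 0x1000, split into four grains.
detected = [0x1000, 0x1004, 0x1100, 0x2000, 0x0FFF]
per_grain = grain_counts(detected, window_base=0x1000,
                         window_size=0x400, grain_size=0x100)
```

Each entry of the returned list corresponds to one grain, so the list could be handed directly to a bar chart or similar graphical format.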
The invention further contemplates a system for monitoring memory transactions on a data processing system such as a NUMA system. The system includes a processor, a device driver configured to receive memory transaction information from a switch connecting the nodes of the NUMA system, and user code configured to enable a user to define a set of memory transaction attributes. The user code is further suitable for displaying the number of memory transactions matching the defined set of memory transaction attributes during a specified duration. The device driver and user code may execute under a first operating system while the NUMA system is operating under a second operating system. The set of memory transaction attributes may include memory transaction type information, memory transaction direction information, and memory transaction node information. The set of memory transaction attributes may also include memory address information.
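The matching of transactions against a user-defined set of attributes, and the counting of matches during a specified duration, might be sketched as follows. The record layout, the field names, the attribute values such as "read" and "in", and the convention that an unset attribute matches any value are assumptions made for illustration; in the described system the transaction information would be supplied by the device driver rather than by an in-memory list.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Transaction:
    """Hypothetical record of one memory transaction seen at the switch."""
    timestamp: float   # seconds since monitoring began
    kind: str          # transaction type, e.g. "read" or "write"
    node: int          # node associated with the transaction
    direction: str     # e.g. "in" or "out" relative to that node


@dataclass(frozen=True)
class AttributeSet:
    """User-defined attributes; None means the attribute is not constrained."""
    kind: Optional[str] = None
    node: Optional[int] = None
    direction: Optional[str] = None

    def matches(self, t: Transaction) -> bool:
        return ((self.kind is None or t.kind == self.kind)
                and (self.node is None or t.node == self.node)
                and (self.direction is None or t.direction == self.direction))


def count_matching(transactions: Iterable[Transaction], attrs: AttributeSet,
                   start: float, duration: float) -> int:
    """Count transactions matching attrs with start <= timestamp < start + duration."""
    end = start + duration
    return sum(1 for t in transactions
               if start <= t.timestamp < end and attrs.matches(t))


# Example input, standing in for data received from the device driver.
sample = [
    Transaction(0.5, "read", 0, "in"),
    Transaction(1.0, "read", 1, "in"),
    Transaction(1.5, "write", 0, "out"),
    Transaction(2.5, "read", 0, "in"),
]
reads_on_node0 = count_matching(sample, AttributeSet(kind="read", node=0),
                                start=0.0, duration=2.0)
```

Calling `count_matching` repeatedly with successive start times would yield one count per interval, which is the kind of series a graphical display could plot over time.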