The present invention relates to monitoring software performance, and more specifically, to monitoring software performance including marking one of a load request and a store request, and tying one or more of effective instruction and data addresses and a fabric response together in a sample.
Causes of contention issues on an inter-processor connection fabric are difficult to identify in a precise and accurate fashion. Looking for long-latency loads and stores may be inexact because many factors can cause latency to increase. Examples include contention for resources in the memory controller, access to distant data within the system, and contention for address and data pathways between the requester and the source of the data. Processor cycle or instruction profiling may be used to reveal that time is being spent in locking routines, but this does not provide data regarding the cause of the delay. Finally, taking fabric traces and analyzing them for lock contention is cumbersome on lab machines and generally not an option on customer machines. In addition, correlating the physical real address on the fabric trace back to a particular effective address in a process in an operating system image is very difficult. Further, the instruction address of the code responsible for the contended address is not available.