The development of the EDVAC computer system of 1948 is often cited as the beginning of the computer era. Since that time, computer systems have evolved into extremely sophisticated devices that may be found in many different settings. Computer systems typically include a combination of hardware (e.g., semiconductors, circuit boards, etc.) and software (e.g., computer programs). As advances in semiconductor processing and computer architecture push the performance of the computer hardware higher, more sophisticated computer software has evolved to take advantage of the higher performance of the hardware, resulting in computer systems today that are much more powerful than just a few years ago. One significant advance in computer technology is the development of parallel processing, i.e., the performance of multiple tasks in parallel.
A number of computer software and hardware technologies have been developed to facilitate increased parallel processing. From a hardware standpoint, computers increasingly rely on multiple microprocessors to provide increased workload capacity. Furthermore, some microprocessors have been developed that support the ability to execute multiple threads in parallel, effectively providing many of the same performance gains attainable through the use of multiple microprocessors. From a software standpoint, multithreaded operating systems and kernels have been developed, which permit computer programs to concurrently execute in multiple threads, so that multiple tasks can essentially be performed at the same time.
In addition, some computers implement the concept of logical partitioning, where a single physical computer is permitted to operate essentially like multiple, independent virtual computers, referred to as logical partitions, with the various resources in the physical computer (e.g., processors, memory, and input/output devices) allocated among the various logical partitions via a partition manager, or hypervisor. Each logical partition executes a separate operating system and, from the perspective of users and of the software applications executing on the logical partition, operates as a fully independent computer.
Because each logical partition is essentially competing with other logical partitions for the limited resources of the computer, users are especially interested in monitoring the partitions in order to ensure that they are achieving satisfactory performance. A performance data collection tool that collects detailed performance metrics is often used for this purpose. One common performance data collection tool is an instruction address sampler within a processor that captures instruction addresses at preset intervals of processor cycles.
When sampling instruction addresses, the operating system typically receives control from the processor via an interrupt mechanism upon the expiration of a sampling interval. The operating system then records the sample data (e.g., instruction addresses) and initializes the next sampling interval. To effectively analyze the performance of a computer system that contains multiple logical partitions running varied operating systems, users often desire to obtain a system-wide collection of instruction address samples. Unfortunately, capturing samples and setting fixed sampling intervals across the entire system, including samples within the hypervisor, is problematic.
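The sampling cycle described above can be illustrated with a small simulation. This is only a sketch under stated assumptions: the class and method names (`Sampler`, `OperatingSystem`, `on_sample_interrupt`, `run`) are hypothetical, one loop iteration stands in for one processor cycle, and a plain method call stands in for the hardware interrupt.

```python
from typing import Callable, List


class Sampler:
    """Hypothetical model of a processor's instruction address sampler."""

    def __init__(self, interval: int, handler: Callable[["Sampler", int], None]) -> None:
        self.remaining = interval   # cycles left in the current sampling interval
        self.handler = handler      # stands in for the interrupt vector

    def tick(self, instruction_address: int) -> None:
        """Advance one cycle; on expiry, pass control to the handler."""
        self.remaining -= 1
        if self.remaining == 0:
            self.handler(self, instruction_address)


class OperatingSystem:
    """Hypothetical operating system side of the sampling cycle."""

    def __init__(self, interval: int) -> None:
        self.interval = interval
        self.samples: List[int] = []

    def on_sample_interrupt(self, sampler: Sampler, instruction_address: int) -> None:
        # Record the sample data, then initialize the next sampling interval.
        self.samples.append(instruction_address)
        sampler.remaining = self.interval


def run(addresses: List[int], interval: int) -> List[int]:
    """Feed a stream of executed instruction addresses through the sampler."""
    os_ = OperatingSystem(interval)
    sampler = Sampler(interval, os_.on_sample_interrupt)
    for address in addresses:
        sampler.tick(address)
    return os_.samples
```

With an interval of three cycles, every third executed address in the stream is captured. A real sampler counts hardware cycles rather than loop iterations, so the model conveys only the order of events: interval expiry, transfer of control, recording, and re-arming.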
A first problem is that the hypervisor saves and restores the state of a physical processor when it allocates that processor to various partitions, which causes the state of the sampling logic in each processor to be specific to each partition and also hides the contribution of the hypervisor itself. A second problem is that the interrupt generated by the expiring interval is seen only by one particular partition, and not by the hypervisor (the hypervisor operates with interrupts disabled), which makes system-wide performance data collection difficult. A third problem is that a partition is aware only of the processors allocated to it and is not aware of the other physical processors of the computer system, which makes system-wide performance data collection more difficult still. These problems impair the usefulness of the performance data collected by instruction address samplers in a logically partitioned system.
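The first problem can be made concrete with a toy model of one physical processor shared by several partitions. This is a sketch under stated assumptions, not a description of any particular hypervisor: the function name `sample_timeline` and the `"HV"` label are hypothetical, the sampler state is reduced to a single countdown that is saved and restored per partition, and cycles spent in the hypervisor are assumed to advance no partition's countdown.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

HYPERVISOR = "HV"  # hypothetical label for cycles spent inside the hypervisor


def sample_timeline(
    timeline: List[Tuple[str, int]], interval: int
) -> Dict[str, List[int]]:
    """Sample one processor whose sampler state is saved/restored per partition.

    `timeline` is a list of (owner, instruction_address) pairs, one per cycle.
    Returns the samples each partition observes.
    """
    saved_remaining: Dict[str, int] = {}            # per-partition sampler state
    samples: Dict[str, List[int]] = defaultdict(list)
    for owner, address in timeline:
        if owner == HYPERVISOR:
            # Assumption: hypervisor cycles belong to no partition's sampler
            # state, so they are never counted and never sampled.
            continue
        # Restore this partition's countdown (or start a fresh interval).
        remaining = saved_remaining.get(owner, interval) - 1
        if remaining == 0:
            samples[owner].append(address)          # seen only by this partition
            remaining = interval
        saved_remaining[owner] = remaining          # save state on the way out
    return dict(samples)
```

In this model each partition receives samples drawn only from its own cycles, and no sample ever carries a hypervisor address, so merging the per-partition results does not yield a system-wide profile.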
Hence, without a better technique for collecting performance data in logically partitioned systems, users will continue to experience difficulty in analyzing system performance.