Modern computer systems with many processors often have non-uniform memory access (NUMA) properties; that is, the cost of accessing data in memory is dependent on the physical location of the memory in relation to the processor which accesses it. As a result, performance improvements can often be gained by running an application on a limited number of processors and allocating memory which is local to those processors, thereby reducing or eliminating the need for costly remote memory accesses. Similarly, multiple threads which frequently access and modify areas of memory which are shared with other threads can benefit from keeping all users of that memory close together, to reduce the amount of cross-node traffic to obtain cache lines which exist in the cache of a remote processor. These two issues can be referred to as memory affinity and cache affinity.
Placing processes in order to increase the benefits of memory and cache affinity typically conflicts with the more general desire to balance work across all available resources of the whole system; clearly, placing all work onto a single node and allocating all memory locally will increase cache and memory affinity, but in general will not provide good performance for all workloads, due to the increased contention for resources on that node. It is therefore desirable to identify tasks which can benefit from memory and cache affinity and group them together, such that a group of related tasks will tend to run closer together, but that unrelated tasks may be placed across other parts of the system.
There are several existing techniques for identifying this grouping, all of which have drawbacks.
1. Have no automatic grouping of tasks performed by the operating system, but allow the user to group tasks and bind them to specific system resources. This approach relies heavily on the user understanding the behaviour of the workloads and the architecture of the system, and is both time-consuming and error-prone. Such manual bindings also typically restrict the operating system's load balancing capabilities, thus making it less responsive to changes in load.
2. Have the operating system attempt to group threads of the same process together, but treat processes as separate entities. This can provide significant benefit for some workloads, as threads of the same process will (in most operating systems) share the same address space and are likely to have considerable overlap in the working set of data used by the threads. However, this approach alone does not account for groupings of multiple processes, which means a significant potential benefit is not catered for.
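By way of illustration only, the limitation of the second technique can be sketched as follows. This is a minimal sketch under stated assumptions, not an implementation taken from any particular operating system: the names `Task`, `group_by_process` and `place_groups`, and the round-robin placement policy, are hypothetical and introduced purely for the example.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple


class Task(NamedTuple):
    """Hypothetical schedulable task: a thread belonging to a process."""
    tid: int  # thread identifier
    pid: int  # identifier of the owning process


def group_by_process(tasks: List[Task]) -> Dict[int, List[int]]:
    """Group thread ids by owning process; processes stay separate entities."""
    groups: Dict[int, List[int]] = defaultdict(list)
    for task in tasks:
        groups[task.pid].append(task.tid)
    return dict(groups)


def place_groups(groups: Dict[int, List[int]], num_nodes: int) -> Dict[int, int]:
    """Assign each process group to a node in round-robin order (assumed policy)."""
    return {pid: index % num_nodes for index, pid in enumerate(sorted(groups))}


# Two threads of process 100 and one thread of process 200.
tasks = [Task(tid=1, pid=100), Task(tid=2, pid=100), Task(tid=3, pid=200)]
groups = group_by_process(tasks)
placement = place_groups(groups, num_nodes=2)
```

In this sketch the threads of process 100 are kept together, but processes 100 and 200 are placed on different nodes even if they share data, because nothing in the grouping records a relationship between separate processes.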
What is required, therefore, is a means to identify groups of processes that can benefit from cache and memory affinity without suffering from these drawbacks.
It should be noted that the term “multiprocessor” as used herein encompasses dual- and multi-core processor devices, as well as multiple hardware thread and multiple CPU systems.
Systems exist which seek to address some of the above issues. One such system relates to a method for improving the execution efficiency of frequently communicating processes utilising affinity scheduling, by identifying the frequently communicating processes and assigning them to the same processor. The system is based on counting “wakeup” requests between two processes: a wakeup request occurs when a first process requiring information from a second process is placed in a sleep state until the second process is able to provide the required information, at which point the first process is awoken. A count of the number of wakeup requests between the pair of processes is maintained and, when a predetermined threshold is reached, the two processes are assigned to the same processor for execution. Whilst this allocation can improve performance, the determination is non-optimal, as will be described below.
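The wakeup-counting scheme described above can be sketched, purely for illustration, as follows. This is a simplified model under stated assumptions and not the implementation of the system referred to: the class `WakeupTracker`, its methods, the threshold value of 3, and the choice to move the awoken process onto the processor of the waking process are all hypothetical.

```python
from collections import Counter
from typing import Dict, FrozenSet

WAKEUP_THRESHOLD = 3  # hypothetical "predetermined threshold"


class WakeupTracker:
    """Counts wakeup requests per pair of processes (illustrative only)."""

    def __init__(self, threshold: int = WAKEUP_THRESHOLD) -> None:
        self.threshold = threshold
        self.counts: Counter = Counter()   # pair of process names -> wakeup count
        self.cpu_of: Dict[str, int] = {}   # process name -> assigned processor

    def assign(self, process: str, cpu: int) -> None:
        """Record the processor on which a process currently runs."""
        self.cpu_of[process] = cpu

    def record_wakeup(self, waker: str, sleeper: str) -> bool:
        """Count one wakeup; co-locate the pair once the threshold is reached.

        Returns True if the two processes now share a processor.
        """
        pair: FrozenSet[str] = frozenset((waker, sleeper))
        self.counts[pair] += 1
        if self.counts[pair] >= self.threshold:
            # Assumed policy: move the awoken process to the waker's processor.
            self.cpu_of[sleeper] = self.cpu_of[waker]
        return self.cpu_of[sleeper] == self.cpu_of[waker]


tracker = WakeupTracker(threshold=3)
tracker.assign("A", 0)
tracker.assign("B", 1)
results = [tracker.record_wakeup("A", "B") for _ in range(3)]
```

In this sketch the pair is co-located only after the third wakeup, which illustrates that such a scheme reacts to communication that has already taken place.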
It is therefore an object of embodiments of the inventive subject matter to provide a means for improved allocation of processes to processors in a multiprocessor system and, in particular, a means capable of identifying and addressing potential conflict issues before they arise.