Task Dispatchers.
Various dispatchers and dispatching techniques have been developed for assigning tasks based on affinity with a processor or processor group in multiprocessor systems. In the field of multiprocessor computer systems, it is often desirable to intelligently assign tasks (e.g., tasks that are to be performed for one or more application programs executing on the system) to particular one(s) of the processors in an effort to improve efficiency and minimize overhead associated with performing the tasks. For instance, as discussed further herein, it is often desirable to assign tasks that are most likely to access the same data to a common processor or processor group to take advantage of the cache memory of the processors. By assigning tasks to the processor or processor group that already has the data most likely to be needed in its local cache memory or memories, efficiencies may be gained.
It can be difficult to strike the right balance of work assignments between and among the processors of a multiprocessor system so that the computing tasks are accomplished in an efficient manner with a minimum of overhead. This appropriate balance may vary considerably depending on the needs of the system's users and to some extent upon the system architectures. It is often desirable to manage the assignment of tasks in a manner that does not require a majority of the available tasks to be assigned to a single processor (nor to any other small subset of all processors). If such an over-assignment of tasks to a small subset of all processors occurs, the small subset of processors is kept too busy to accomplish all its tasks efficiently while others are waiting relatively idle with few or no tasks to do. Thus the system will not operate efficiently. Accordingly, a management technique that employs a load balancing or work distribution scheme is often desirable to maximize overall system efficiency. Various such load balancing and work distribution schemes have been developed and employed for scheduling tasks in multiprocessor systems.
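By way of illustration only, one simple work distribution scheme of the kind referred to above assigns each arriving task to whichever processor currently has the least accumulated work. The following is a minimal sketch under stated assumptions: the function name `assign_least_loaded`, the notion of a numeric per-task cost, and the lowest-index tie-break are all hypothetical and are not drawn from any particular system.

```python
def assign_least_loaded(task_costs, num_processors):
    """Assign each task, in arrival order, to the least-loaded processor.

    task_costs is a list of hypothetical numeric work estimates, one per task.
    Returns (assignment, loads): the processor index chosen for each task and
    the total work accumulated on each processor.
    """
    loads = [0] * num_processors
    assignment = []
    for cost in task_costs:
        # Choose the processor with the smallest accumulated load;
        # min() returns the lowest index when several loads are equal.
        target = min(range(num_processors), key=lambda p: loads[p])
        loads[target] += cost
        assignment.append(target)
    return assignment, loads
```

Such a scheme avoids concentrating tasks on a small subset of processors, but it takes no account of which processor may already hold a task's data in cache.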
Multiprocessor systems are usually designed with cache memories to alleviate the imbalance between high performance processors and the relatively slow main memories. Cache memories are physically closer to their processors and so can be accessed more quickly than main memory. They are managed by the system's hardware and they contain copies of recently accessed locations in main memory. Typically, a multiprocessor system includes small, very fast, private cache memories adjacent to each processor, and larger, slower cache memories that may be either private or shared by a subset of the system's processors. The performance of a processor executing a software application depends on whether the application's memory locations have been cached by the processor, or are still in memory, or are in a close-by or remote processor's cache memory.
To take advantage of cache memory (which provides quicker access to data because of the cache's proximity to individual processors or groups of processors), it may be desirable to employ a task management scheme that assigns tasks based on affinity with the processor or processor group that already has the data most likely to be needed in its local cache memory or memories. As is understood in this art, where a processor has acted on part of a problem (loading a program, running a transaction, or the like), it is likely to reuse the same data or instructions, because these will be found in its local cache once the problem is begun. By affinity we mean that a task, having executed on a processor, will tend to execute next on that same processor or on a processor with fast access to the cached data. (Tasks begun may not complete due to a hardware interrupt or for various other well-understood reasons not relevant to our discussion.)
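The notion of affinity described above may be sketched, purely as an example, as a preference order: first the processor on which the task last executed, then any processor in the same group (assumed here to share a cache), then any idle processor. The names `pick_processor`, `last_ran_on`, and `group_of` are hypothetical and are introduced only for this illustration.

```python
def pick_processor(task_id, last_ran_on, group_of, idle):
    """Pick an idle processor for a task, preferring cache affinity.

    last_ran_on: dict mapping task id -> processor the task last ran on.
    group_of:    dict mapping processor -> group label (assumed shared cache).
    idle:        set of processors currently ready for work.
    Returns the chosen processor, or None if no processor is idle.
    """
    if not idle:
        return None
    last = last_ran_on.get(task_id)
    if last is None:
        # The task has never run, so it has no affinity to honor.
        return min(idle)
    if last in idle:
        # Best case: the same processor, whose private cache may still be warm.
        return last
    # Next best: a processor in the same group as the last processor.
    same_group = [p for p in idle if group_of[p] == group_of[last]]
    if same_group:
        return min(same_group)
    # No affinity can be honored; fall back to any idle processor.
    return min(idle)
```

For example, with processors 0 and 1 in one group and 2 and 3 in another, a task that last ran on processor 2 would be placed on processor 3 when only processors 0 and 3 are idle.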
Language in the computer arts is sometimes confusing, as similar terms mean different things to different people and even to the same people in different contexts. Here, we use the word “task” to indicate a process. Tasks are sometimes thought of as consisting of multiple independent threads of control, any of which could be assigned to different processor groups, but we will use the word task more simply, referring generically to a particular process.
The two above-mentioned desires of affinity and load balancing seem to be in conflict. Permanently retaining task affinity could lead to overloading some processors or groups of processors. Redistributing tasks to processors to which they have no affinity will yield few cache hits and slow down the processing overall. These problems get worse as the size of the multiprocessor computer systems gets larger.
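One way to picture the tension between affinity and load balancing is a rule that honors affinity only while the preferred processor is not too much busier than the least busy one. The following sketch is offered merely as an example of such a compromise; the function name `choose_queue` and the tunable `max_imbalance` threshold are assumptions made for illustration and do not describe any particular dispatcher.

```python
def choose_queue(preferred, queue_lengths, max_imbalance):
    """Choose a processor queue for a task that has affinity with `preferred`.

    queue_lengths[p] is the number of tasks already waiting on processor p.
    Affinity is kept while the preferred queue is at most `max_imbalance`
    tasks longer than the shortest queue; otherwise the task is moved to the
    shortest queue, giving up its likely cache hits.
    """
    shortest = min(range(len(queue_lengths)), key=lambda p: queue_lengths[p])
    if queue_lengths[preferred] - queue_lengths[shortest] <= max_imbalance:
        return preferred
    return shortest
```

A large `max_imbalance` favors cache hits at the risk of overloading some processors, while a small value favors even distribution at the cost of more cache misses.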
Conventionally, computer systems use switching queues and associated algorithms for controlling the assignment of tasks to processors. Typically, these algorithms are considered an Operating System (OS) function. When a processor “wants” (is ready for) a new task, it will execute the (usually) re-entrant code that embodies the algorithm that examines the switching queue. This code is commonly called a “dispatcher.” The dispatcher determines the next task to be performed from the switching queue, and the processor then performs that task.
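As a simplified illustration of the arrangement just described, the sketch below models a single switching queue shared by all processors, with a `dispatch` routine that any processor may call when it is ready for work. The class name `SwitchingQueue`, the use of a numeric priority (lower values dispatched first), and the use of a lock to keep concurrent calls safe are all assumptions made for this example rather than features of any particular OS.

```python
import heapq
import threading


class SwitchingQueue:
    """A shared queue of pending tasks, ordered by a hypothetical priority."""

    def __init__(self):
        self._heap = []
        self._lock = threading.Lock()  # guards the heap against concurrent callers
        self._seq = 0  # tie-breaker: equal priorities are dispatched in arrival order

    def add(self, priority, task):
        """Place a task on the queue; lower priority values are dispatched first."""
        with self._lock:
            heapq.heappush(self._heap, (priority, self._seq, task))
            self._seq += 1

    def dispatch(self):
        """Called by a processor that is ready for work.

        Returns the next task to perform, or None if the queue is empty.
        """
        with self._lock:
            if not self._heap:
                return None
            _, _, task = heapq.heappop(self._heap)
            return task
```

Each processor would typically call `dispatch` in a loop, performing the returned task and then returning for the next one. A single shared queue of this kind takes no account of affinity.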
While care should generally be taken in implementing a task assignment (or “scheduling”) framework that strikes a desired balance between affinity and load balancing, a task assignment scheme that includes some degree of affinity management is often desirable. Exemplary dispatchers and dispatching techniques for assigning tasks based on affinity with processor(s) are described further in the following U.S. patents: 1) U.S. Pat. No. 6,658,448 titled “System and method for assigning processes to specific CPUs to increase scalability and performance of operating systems;” 2) U.S. Pat. No. 6,996,822 titled “Hierarchical affinity dispatcher for task management in a multiprocessor computer system;” 3) U.S. Pat. No. 7,159,221 titled “Computer OS dispatcher operation with user controllable dedication;” 4) U.S. Pat. No. 7,167,916 titled “Computer OS dispatcher operation with virtual switching queue and IP queues;” 5) U.S. Pat. No. 7,287,254 titled “Affinitizing threads in a multiprocessor system;” 6) U.S. Pat. No. 7,461,376 titled “Dynamic resource management system and method for multiprocessor systems;” and 7) U.S. Pat. No. 7,464,380 titled “Efficient task management in symmetric multi-processor systems,” the disclosures of which are hereby incorporated herein by reference. While the above-incorporated U.S. patents disclose exemplary systems and dispatchers and thus aid those of ordinary skill in the art in understanding exemplary implementations that may be employed for assigning tasks based on affinity with processor(s), embodiments of the present invention are not limited to the exemplary systems or dispatchers disclosed therein.
Emulated Environments.
In some instances, it is desirable to emulate one processing environment within another “host” environment or “platform.” For instance, it may be desirable to emulate an OS and/or one or more instruction processors (IPs) in a host system. Processor emulation has been used over the years for a multitude of objectives. In general, processor emulation allows an application program and/or OS that is compiled for a specific target platform (or IP instruction set) to be run on a host platform with a completely different or overlapping architecture set (e.g., different or “heterogeneous” IP instruction set). For instance, IPs having a first instruction set may be emulated on a host system (or “platform”) that contains heterogeneous IPs (i.e., having a different instruction set than the first instruction set). In this way, application programs and/or an OS compiled for the instruction set of the emulated IPs may be run on the host system. Of course, the tasks performed for emulating the IPs (and enabling their execution of the application programs and/or OS running on the emulated IPs) are performed by the actual, underlying IPs of the host system.
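By way of a deliberately small example, an emulated IP may be pictured as a software loop, executed by a host IP, that fetches each instruction of the emulated instruction set and carries out its effect. The three-instruction set below (`LOAD`, `ADD`, `HALT`), the single accumulator, and the function name `run_emulated_ip` are entirely hypothetical and bear no relation to any real instruction set.

```python
def run_emulated_ip(program):
    """Interpret a program written for a hypothetical emulated instruction set.

    program is a list of tuples such as ("LOAD", 2), ("ADD", 3), ("HALT",).
    Returns the value left in the emulated accumulator.
    """
    acc = 0  # emulated accumulator register
    pc = 0   # emulated program counter
    while pc < len(program):
        op, *args = program[pc]  # fetch and decode
        if op == "LOAD":
            acc = args[0]
        elif op == "ADD":
            acc += args[0]
        elif op == "HALT":
            break
        else:
            raise ValueError(f"unknown emulated instruction: {op}")
        pc += 1
    return acc
```

Every step of this loop is itself performed by instructions of the host system's own IPs, which is the sense in which the host performs the work of the emulated IPs.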
As one example, assume a host system is implemented having a commodity-type OS (e.g., WINDOWS® or LINUX®) and a plurality of IPs having a first instruction set. A legacy operating system (e.g., OS 2200) may be implemented on such host system, and IPs that are compatible with the legacy OS (and that have an instruction set different from the first instruction set of the host system's IPs) may be emulated on the host system. In this way, the legacy OS and application programs compiled for the emulated IPs' instruction set may be run on the host system (e.g., by running on the emulated IPs). Additionally, application programs and a commodity-type OS that are compiled for the first instruction set may also be run on the system, by executing directly on the host system's IPs.
One exemplary area in which emulated IPs have been desired and employed is for enabling an OS and/or application programs that have conventionally been intended for execution on mainframe data processing systems to instead be executed on off-the-shelf commodity-type data processing systems. Software application programs that require a large degree of data security and recoverability were traditionally supported by mainframe data processing systems. Such software application programs may include those associated with utility, transportation, finance, government, and military installations and infrastructures, as examples. Such application programs were generally supported by mainframe systems because mainframes generally provide a large degree of data redundancy, enhanced data recoverability features, and sophisticated data security features.
As smaller “off-the-shelf” commodity data processing systems, such as personal computers (PCs), have increased in processing power, there has been some movement towards using those types of systems to support industries that historically employed mainframes for their data processing needs. For instance, one or more PCs may be interconnected to provide access to “legacy” data that was previously stored and maintained using a mainframe system. Going forward, the PCs may be used to update this legacy data, which may comprise records from any of the aforementioned sensitive types of applications.
For these and other reasons it has become popular to emulate a legacy OS (e.g., OS 2200 from UNISYS® Corp.) on a machine operating under the primary control of a commodity OS (e.g., LINUX®). In other words, IPs are emulated on a host system, and the legacy OS, as well as certain application programs, run on the emulated IPs. Exemplary emulated environments (e.g., with emulated IPs) are described further in: 1) U.S. Pat. No. 6,587,897 titled “Method for enhanced I/O in an emulated computing environment;” 2) U.S. Pat. No. 7,188,062 titled “Configuration management for an emulator operating system;” 3) U.S. Pat. No. 7,058,932 titled “System, computer program product, and methods for emulation of computer programs;” 4) U.S. Patent Application Publication Number 2010/0125554 titled “Memory recovery across reboots of an emulated operating system;” 5) U.S. Patent Application Publication Number 2008/0155224 titled “System and method for performing input/output operations on a data processing platform that supports multiple memory page sizes;” and 6) U.S. Patent Application Publication Number 2008/0155246 titled “System and method for synchronizing memory management functions of two disparate operating systems,” the disclosures of which are hereby incorporated herein by reference. While the above-incorporated disclosures provide exemplary emulated systems and thus aid those of ordinary skill in the art in understanding exemplary implementations that may be employed for emulating IPs on a host system and deploying an OS (e.g., a legacy OS) and/or application programs running on such emulated IPs, embodiments of the present invention are not limited to the exemplary systems or emulation techniques disclosed therein.