In the latter half of the twentieth century, there began a phenomenon known as the information revolution. While the information revolution is a historical development broader in scope than any one event or machine, no single device has come to represent the information revolution more than the digital electronic computer. The development of computer systems has surely been a revolution. Each year, computer systems grow faster, store more data, and provide more applications to their users. At the same time, the cost of computing resources has consistently declined, so that information which was too expensive to gather, store and process a few years ago is now economically feasible to manipulate via computer. The reduced cost of information processing drives increasing productivity in a snowballing effect, because product designs, manufacturing processes, resource scheduling, administrative chores, and many other tasks are made more efficient.
Early computer systems were isolated machines, in which data was input manually or from storage media, and output was generated to storage media or in human-perceptible form. While useful in their day, these systems were extremely limited in their ability to access and share information. As computers became more capable, and the ability to store vast amounts of digital data became prevalent, the desirability of communicating with other computer systems and sharing information became manifest. This demand for sharing information led to a growth of computer networks, including the Internet. It is now rare to find a general-purpose computer system having no access to a network for communicating with other computer systems, although many special-purpose digital devices still operate in isolated environments.
More recently, this evolution of isolated computers to networked devices and shared information has proceeded to a further stage in digital data processing: the cloud. The “cloud” is in fact a collection of computing hardware and software resources which are accessible from remote locations to perform useful work on behalf of a client. However, except for the access point, such as a client computer terminal having limited capability, the client does not own or control hardware and software resources which provide computing services in the cloud. The cloud presents a virtualized system having the capability to provide whatever computing services are required. The client contracts to obtain the computing services. These services are provided by the virtualized system, i.e., without any specification of the particular physical computer systems which will provide the contracted service. This virtualization enables a provider of services in the cloud to re-allocate the physical computer resources as convenient, without involvement of the client. Cloud computing has thus been analogized to an electric utility, in which the customer purchases electric power without any knowledge of, or concern for, how the power is generated. Cloud computing not only enables communications with other computing devices and access to remotely stored data, but also enables the entire computing task to be performed remotely.
Although some of the concepts used in cloud computing date back to the 1950s and early time-sharing systems, the use of cloud computing in a global networked environment is a relatively new phenomenon, and deployment of cloud computing on a broad scale is still in its early stages. New challenges arise as designers attempt to implement the broad concepts of cloud computing in functioning systems. Among the challenges of cloud computing is the efficient allocation of cloud resources.
For any of various reasons, it is often necessary or desirable to migrate workload from one computer system (a source) to another computer system (a target). Often, workload migration takes the form of migrating one or more logical partitions from the source to the target, so that the workload of each migrated partition, previously performed in the source, is subsequently performed in the target. For example, each client of a server may have its own logical partition within the server for one or more respective client processes, so workload is migrated by moving the workload of one or more clients, and reconstructing the partition parameters, on one or more other server systems. A partition may be migrated to balance workload among multiple systems, but may also be migrated to perform maintenance on the source system or for some other reason.
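By way of illustration only, the migration of a logical partition described above can be sketched in Python. The names `Partition`, `Server`, and `migrate`, and the two resource parameters (processors and memory), are hypothetical and are chosen for this example; an actual system would carry many more partition parameters and would move live state. The sketch shows only the two steps named above: reconstructing the partition parameters on the target, and ceasing to perform the workload on the source.

```python
from dataclasses import dataclass, field


@dataclass
class Partition:
    """Hypothetical logical partition: a name plus its resource parameters."""
    name: str
    cpus: float
    memory_gb: float


@dataclass
class Server:
    """Hypothetical server system hosting zero or more logical partitions."""
    name: str
    cpu_capacity: float
    memory_capacity_gb: float
    partitions: dict = field(default_factory=dict)

    def free_cpus(self) -> float:
        return self.cpu_capacity - sum(p.cpus for p in self.partitions.values())

    def free_memory_gb(self) -> float:
        return self.memory_capacity_gb - sum(
            p.memory_gb for p in self.partitions.values()
        )


def migrate(partition_name: str, source: Server, target: Server) -> None:
    """Move one partition's workload from the source to the target."""
    part = source.partitions[partition_name]
    # Assumption: the target must have enough unallocated resources.
    if part.cpus > target.free_cpus() or part.memory_gb > target.free_memory_gb():
        raise ValueError(f"{target.name} cannot host partition {part.name}")
    # Reconstruct the partition parameters on the target system.
    target.partitions[part.name] = Partition(part.name, part.cpus, part.memory_gb)
    # The workload is no longer performed in the source.
    del source.partitions[part.name]
```

For example, a source hosting a partition defined with 4 processors and 32 GB of memory could pass that partition to `migrate` together with a target having sufficient free capacity, after which the partition appears only in the target's `partitions`.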
Physically, many large server systems are designed as systems having a non-uniform memory access (NUMA) architecture, in which multiple processors and main memory are physically distributed, so that each processor has some portion of main memory which is in closer physical proximity (and is accessed faster) than other portions of main memory. In such a system, it is desirable, insofar as possible, to hold instructions and other data required for executing a process or thread in the main memory portion which is physically closest to the processor executing the process or thread, a characteristic referred to as “processor-memory affinity” or “affinity”.
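The effect of processor-memory affinity can be illustrated with a minimal Python sketch. The function names and the latency figures (100 ns for local access, 300 ns for remote access) are assumptions made for this example and do not describe any particular hardware; real NUMA systems may have more than two distance levels. The sketch computes what fraction of a thread's memory pages are held in the node of the processor executing it, and the resulting mean access latency.

```python
def access_latency_ns(cpu_node: int, mem_node: int,
                      local_ns: float = 100.0, remote_ns: float = 300.0) -> float:
    """Latency of one access; the default figures are illustrative assumptions."""
    return local_ns if cpu_node == mem_node else remote_ns


def affinity_fraction(cpu_node: int, page_nodes: list) -> float:
    """Fraction of a thread's pages held in the memory local to its processor."""
    if not page_nodes:
        return 1.0  # no pages, so nothing is remote
    local = sum(1 for node in page_nodes if node == cpu_node)
    return local / len(page_nodes)


def mean_latency_ns(cpu_node: int, page_nodes: list) -> float:
    """Mean access latency, assuming every page is accessed equally often."""
    if not page_nodes:
        return 0.0
    total = sum(access_latency_ns(cpu_node, node) for node in page_nodes)
    return total / len(page_nodes)
```

Under these assumed figures, a thread running on node 0 whose four pages are split evenly between node 0 and node 1 has an affinity fraction of 0.5 and a mean latency of 200 ns, compared with 100 ns if all four pages were local.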
Conventional cloud and multi-system management tools do not always optimally manage workload in a complex multi-server environment. With the growth in cloud computing and other forms of shared and distributed use of computing resources, a need exists for improved techniques for managing workload among multiple systems, and in particular, for managing the migration of workload from a source server system in a multi-server environment to one or more target server systems.