1. Field
The present invention relates to migration of computing processes, in particular when the processes are running in parallel on a distributed computing system with distributed memory.
2. Description of the Related Art
The invention has practical applications in particular in the area of computer programs which use distributed memory and exchange information dynamically. Many of these programs use distributed processing and memory that are divided to correspond to individual elements, each of which requires some form of computation.
One example of such a distributed parallel application is a computer program monitoring a sensor network, or a communications network. Each sensor in a sensor network can be viewed as an individual element requiring computation, for instance to process sensed values and/or to determine characteristics such as overall processing load. The computations can be carried out by a computing program which monitors the sensor network, and can also control the sensor network.
In a communications network, each entity (such as a mobile terminal or user equipment (UE), base station or relay station) can also be viewed as an element requiring computation, for example to determine an overall load. The computations can be carried out by a computing program which monitors and/or controls the communications network.
A further example is in monitoring stock trading, for instance to analyse the data for illegal trading practices. Computation may be required to track the transactions of each trader.
A yet further example is simulation. In many simulations, an iterative computation or iterative sets of computations are carried out, each computation corresponding to a single element in the simulation. Simulation elements may be linked in that a computation for one element of the simulation may require values from other elements of the simulation, so that data transfer between processes carrying out the simulation is considerable. Computer programs carrying out such simulations require the workload associated with the computations to be allocated to suitable computing resource, for example within a distributed system.
In these computer programs and other computer programs with linked computations, there may be a requirement to migrate processes. For example, it may be necessary to move the entire execution to a new computer system or new part of a computer system to allow for changes in resource utilization by the computer program itself or by other computer programs. For example, in a sensor network, an emergency event (such as an earthquake) may lead to a sudden requirement for more resource to allow quicker processing. In telecommunications, the conditions during peak periods and off-peak periods of usage might be better monitored using different systems. Also, data input for trading analysis, or trading monitoring, must be updated instantaneously (or at least within a short time frame). A surge in trading may require the use of extra resources.
Execution of the computer program itself may require a change in resource within the same system, for instance due to development of factors within the program or to external considerations.
As mentioned previously, many computer programs have individual elements requiring individual computation, and the computation for some elements may also affect other elements. Two examples requiring a high level of communication between elements are the use of finite element and finite volume methods to simulate material characteristics, including those of both fluids and solids.
Taking computational fluid dynamics (CFD) as an example, this technique uses numerical methods and algorithms to solve and analyze problems that involve fluid flows. There are many approaches to CFD modeling and other three-dimensional algorithmic modeling, but the same basic three-stage procedure is almost always followed.
During pre-processing, the geometry (physical bounds) of the problem is defined, and the volume occupied by the fluid or other material is divided into discrete cells or nodes (the mesh). The mesh may be uniform or non-uniform, and its division into cells or nodes may be adaptive, to change the mesh size as appropriate during simulation. The physical modeling is defined using appropriate equations, and boundary conditions are defined. In CFD this involves specifying the fluid behavior and properties at the boundaries of the problem. For transient problems, the initial conditions are also defined.
During processing, the simulation is started and the equations are solved iteratively on a per-cell/per-node basis, as a steady-state or transient problem.
Finally, a postprocessor is used for the analysis and visualization of the resulting solution.
The data for each mesh node or discrete cell can be viewed as a single element in the simulation.
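By way of illustration only, the three-stage procedure described above can be sketched for a very simple case. The sketch assumes a one-dimensional steady-state diffusion problem on a uniform mesh, solved by Jacobi iteration; the function name, parameters and tolerance are hypothetical and are not taken from any particular CFD package.

```python
def jacobi_steady_state(left, right, n_cells, tol=1e-8, max_iters=100_000):
    """Illustrative 1D steady-state solve on a uniform mesh (Jacobi iteration)."""
    # Pre-processing: define the mesh (n_cells interior cells) and the
    # boundary conditions (fixed values at both ends).
    values = [0.0] * (n_cells + 2)
    values[0], values[-1] = left, right

    # Processing: update every interior cell from its two neighbours and
    # repeat until the largest change in one sweep falls below tol.
    iteration = 0
    for iteration in range(1, max_iters + 1):
        new_values = values[:]
        for i in range(1, n_cells + 1):
            new_values[i] = 0.5 * (values[i - 1] + values[i + 1])
        change = max(abs(a - b) for a, b in zip(new_values, values))
        values = new_values
        if change < tol:
            break
    return values, iteration


# Post-processing: here the solution is simply printed.
solution, sweeps = jacobi_steady_state(left=0.0, right=10.0, n_cells=9)
print(f"converged after {sweeps} sweeps")
print([round(v, 3) for v in solution])
```

Each cell value depends only on the values of its neighbours from the previous sweep, which is why each cell can be treated as a single element of the simulation.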
Another example of a computer simulation is agent modeling (also referred to as agent-based modeling) in which individuals can be viewed as elements of a simulation.
An agent-based model (ABM) (also sometimes related to the term multi-agent system or multi-agent simulation) is a computational model for simulating the actions and interactions of autonomous agents with a view to assessing their effects on the system as a whole. In many models, each agent is an individual (person, animal or other autonomous element). In order to simulate the individual's behavior, the individual is given attributes, such as a moveable position and rule-based reactions to stimuli, including other individuals.
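A minimal sketch of such a model is given below, purely as an illustration. It assumes agents on a one-dimensional line with a single hypothetical rule: an agent steps away from its nearest neighbour when that neighbour is closer than a given radius. The names `Agent`, `step_agents`, `radius` and `stride` are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass
class Agent:
    name: str
    position: float  # moveable position attribute


def step_agents(agents, radius=2.0, stride=1.0):
    """Advance all agents by one step using a rule-based reaction."""
    new_positions = []
    for agent in agents:
        others = [other for other in agents if other is not agent]
        nearest = min(
            others,
            key=lambda other: abs(other.position - agent.position),
            default=None,
        )
        if nearest is not None and abs(nearest.position - agent.position) < radius:
            # Stimulus: another individual is too close, so move away from it.
            direction = 1.0 if agent.position >= nearest.position else -1.0
            new_positions.append(agent.position + direction * stride)
        else:
            new_positions.append(agent.position)
    # Apply all moves together so every agent reacts to the same old state.
    for agent, position in zip(agents, new_positions):
        agent.position = position


population = [Agent("a", 0.0), Agent("b", 1.0), Agent("c", 10.0)]
step_agents(population)
print([(agent.name, agent.position) for agent in population])
```

Because each agent reads the positions of other agents, the computation for one element requires data from other elements, as in the linked computations discussed above.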
A further example of simulation is particle simulation, which simulates a dynamic system of particles, usually under the influence of physical forces such as gravity. Each particle may be viewed as a single element in such a simulation.
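As a hedged illustration, one time step of such a simulation might be sketched as follows. The sketch assumes two-dimensional particles under uniform gravity and a simple semi-implicit Euler update; the names `Particle` and `step_particles` and the time step are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Particle:
    x: float
    y: float
    vx: float
    vy: float


def step_particles(particles, dt, g=9.81):
    """Advance every particle by one time step under uniform gravity."""
    for p in particles:
        # Update the velocity from the force first, then the position
        # from the new velocity (semi-implicit Euler).
        p.vy -= g * dt
        p.x += p.vx * dt
        p.y += p.vy * dt


cloud = [Particle(x=0.0, y=10.0, vx=1.0, vy=0.0)]
step_particles(cloud, dt=0.1)
print(cloud[0])
```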
Computationally intense applications like these are often carried out on high performance computer systems. Such high performance computer (HPC) systems often provide distributed environments in which there is a plurality of processing units or cores on which processing threads of an executable can run autonomously in parallel.
Many different hardware configurations and programming models are applicable to high performance computing. A currently popular approach to high performance computing is the cluster system, in which a plurality of nodes, each having one or more multicore processors (or “chips”), are interconnected by a high-speed network. Each core is assumed to have its own area of memory. The cluster system can be programmed by a human programmer who writes source code, making use of existing code libraries to carry out generic functions. The source code is then compiled to lower-level executable code, for example code at the ISA (Instruction Set Architecture) level capable of being executed by processor types having a specific instruction set, or to assembly language dedicated to a specific processor. There is often a final stage of assembling (or, in the case of a virtual machine, interpreting) the assembly code into executable machine code. The executable form of an application (sometimes simply referred to as an “executable”) is run under supervision of an operating system (OS).
Applications for computer systems having multiple cores may be written in a conventional computer language (such as C/C++ or Fortran), augmented by libraries for allowing the programmer to take advantage of the parallel processing abilities of the multiple cores. In this regard, it is usual to refer to “processes” being run on the cores.
In cluster systems and in other distributed memory systems, migration of an execution can require synchronization of data.
To assist understanding of the invention to be described, some relevant considerations are set out below, using simulation as an example.
A “fixed number of processors” model assumes that the workload to be distributed, the priority of the job (i.e. how urgently the results are required) and the system on which the job is running will remain constant over the duration of the simulation. However, this may not be the case.
In the example of an adaptive finite element simulation, the number of mesh nodes in the simulation may vary by one or more orders of magnitude over the course of the simulation. During the simulation, the mesh may be partitioned so that the data relating to different nodes are stored in memory associated with different processors. A processor “owns” the nodes whose data are primarily stored in its local memory. Nodes that are owned by other processors, but whose data are required to compute values for a node owned by a given processor, are known as “halo nodes” to that processor.
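The relationship between owned nodes and halo nodes can be sketched as follows. This is an illustration only: it assumes that the mesh is given as an adjacency mapping, that the partition is given as a mapping from node to owning processor, and that a node needs data only from its direct neighbours. The function name `halo_nodes` is hypothetical.

```python
def halo_nodes(adjacency, owner):
    """Return, for each processor, the set of its halo nodes.

    adjacency: dict mapping each node to an iterable of neighbouring nodes.
    owner:     dict mapping each node to the processor that owns it.
    """
    halos = {processor: set() for processor in set(owner.values())}
    for node, neighbours in adjacency.items():
        processor = owner[node]
        for neighbour in neighbours:
            # A neighbour owned elsewhere is a halo node for this processor.
            if owner[neighbour] != processor:
                halos[processor].add(neighbour)
    return halos


# A chain of six mesh nodes, split between two processors.
chain = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
ownership = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1}
print(halo_nodes(chain, ownership))  # {0: {3}, 1: {2}}
```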
When the number of mesh nodes varies, the number of mesh nodes allocated to each processor of a distributed system may become very low compared to the number of halo nodes per processor at some stage of the simulation. This can lead to a very high communication-to-computation ratio because data from the halo nodes is required for simulation of the mesh nodes. In this case, it may become faster to run the simulation on a smaller number of processors (reducing the communication).
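The effect can be illustrated with a rough estimate. The sketch below assumes a square mesh of n by n nodes divided into p horizontal strips of equal height, with each strip needing one row of halo nodes from each adjacent strip; the ratio of halo nodes to owned nodes is used as a simple stand-in for the communication-to-computation ratio. These assumptions and the function name are illustrative only.

```python
def halo_to_owned_ratio(n, p):
    """Halo-to-owned node ratio for the busiest strip of an n x n mesh."""
    if p < 1 or n % p != 0:
        raise ValueError("p must be positive and divide n exactly")
    owned = n * (n // p)              # nodes stored on one processor
    adjacent_strips = min(2, p - 1)   # an interior strip has two neighbours
    halo = adjacent_strips * n        # one row of n nodes per adjacent strip
    return halo / owned


# A large mesh on 10 processors: communication is a small fraction.
print(halo_to_owned_ratio(1000, 10))
# The mesh has shrunk: each processor now has as many halo nodes as owned nodes.
print(halo_to_owned_ratio(20, 10))
# The same small mesh on 2 processors: the ratio falls again.
print(halo_to_owned_ratio(20, 2))
```

Under these assumptions the ratio for an interior strip is 2p/n, so it grows as the mesh shrinks for a fixed number of processors and falls when fewer processors are used.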
Alternatively, changed priorities may make it desirable to allocate extra resources to a given job on the HPC system or move the job to an entirely new system. For example, a real-time disaster warning system may be running as a monitoring network or as a simulation at low priority on a small subset of the available resources when a sudden event—e.g. an earthquake—perturbs the system that it is simulating, requiring rapid monitoring or simulation of the consequences using as many resources as can be made available. Simultaneously, other jobs must be scaled down to smaller systems in order to make way for the high priority job.
Further, a response to a failure in one of n processors allocated to a job might be to migrate the execution to run on (n−1) processors (rather than terminating the job entirely).
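To indicate what such a migration involves, the following sketch repartitions a set of elements from n processors onto the (n−1) surviving processors and lists the elements whose owner changes. It assumes a simple contiguous block partition; the function names are hypothetical, and the sketch does not address how the data held by the failed processor are recovered.

```python
def block_partition(n_elements, ranks):
    """Assign contiguous blocks of element indices to the listed ranks."""
    owner = {}
    base, extra = divmod(n_elements, len(ranks))
    start = 0
    for index, rank in enumerate(ranks):
        # The first `extra` ranks take one additional element each.
        size = base + (1 if index < extra else 0)
        for element in range(start, start + size):
            owner[element] = rank
        start += size
    return owner


def migration_plan(old_owner, new_owner):
    """Map each element that must move to its (old rank, new rank) pair."""
    return {
        element: (old_owner[element], new_owner[element])
        for element in old_owner
        if old_owner[element] != new_owner[element]
    }


before = block_partition(12, ranks=[0, 1, 2, 3])
after = block_partition(12, ranks=[0, 1, 3])  # rank 2 has failed
print(migration_plan(before, after))
```

In this example four of the twelve elements change owner, including one element that moves between two processors that have not failed, which indicates why data synchronization is needed even for a small change in the number of processors.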
It is therefore desirable to provide a way of migrating execution of a computer program that can be used flexibly for different circumstances and that takes into account the difficulty of migration and synchronization.