This disclosure relates to the computer simulation of physical objects and their interactions. In particular, but not by way of limitation, this disclosure relates to methods of domain decomposition for load balancing, such that computational tasks are distributed evenly among available computer nodes for fastest completion.
Computer simulation is essential to many new developments in science and technology. Many physical systems or objects and their interactions are simulated by computers rather than studied by physical experiments due to various considerations. For example, in nuclear weapon design and testing, nuclear device detonation is banned by international treaties. To test the function, safety or effectiveness of a nuclear device, detonations are simulated by computers. In numerical weather forecasting, by definition, the forecast has to be issued before the weather condition actually occurs. The progression of weather changes is simulated by numerical computer simulations. In oil and gas exploration, seismic surveys are conducted to find possible oil and gas reservoirs. The resulting seismic survey data are enormous in both quantity and complexity. To understand the data, which can reveal the interactions of seismic waves and the underlying Earth structures, computer simulations are used.
Computer simulations, parallel computing in particular, are widely used for large scale simulation of complex systems, some of which are mentioned above. For simplicity of discussion, simulation of seismic wave propagation is used as an example in the discussion below. Seismic waves can be characterized by the partial differential equations:
$$\rho \frac{\partial v_i}{\partial t} = \frac{\partial \sigma_{ij}}{\partial j} + F_i, \qquad \frac{\partial \sigma_{ij}}{\partial t} = \frac{1}{2} c_{ijkl} \left( \frac{\partial v_k}{\partial l} + \frac{\partial v_l}{\partial k} \right)$$
where indices i, j, k, l represent a component of a vector or a tensor field in Cartesian coordinates (x, y, z); $v_i$ and $\sigma_{ij}$ represent the velocity and stress field, respectively; $F_i$ denotes an external source force; $\rho$ is the material density; $c_{ijkl}$ are the stiffness tensors that describe the seismic medium; $\frac{\partial}{\partial t}$ represents a time derivative; and $\frac{\partial}{\partial j}$ represents a spatial derivative with respect to the j-th direction. A Cartesian coordinate system is used here, but other coordinate systems (e.g. polar, cylindrical or spherical coordinate systems) may be used, depending on the domain being simulated.
The wave equations are similar to many other physical systems that can be characterized by partial differential equations.
In computer simulations, the above partial differential equations are solved using finite difference methods, finite elements, spectral elements and many other methods. For simplicity, only the finite difference method is discussed below.
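As an illustration only, the following minimal sketch shows one finite-difference time step for a one-dimensional reduction of the velocity-stress equations, in which the density equation becomes ρ ∂v/∂t = ∂σ/∂x + F and the stress equation becomes ∂σ/∂t = c ∂v/∂x. The staggered grid layout, the rigid end points, the function name `fd_step` and all parameter values are assumptions made for this example and are not taken from this disclosure.

```python
def fd_step(v, s, rho, c, dx, dt, force=None):
    """Advance a 1D velocity-stress system by one time step, in place.

    v     : list of n velocity samples at grid points i
    s     : list of n-1 stress samples at the half points i + 1/2 (staggered)
    rho   : density (assumed constant in this sketch)
    c     : stiffness (assumed constant in this sketch)
    force : optional list of n source-force samples
    """
    n = len(v)
    # Velocity update: rho * dv/dt = ds/dx + F.
    # The two end points are left untouched (rigid boundary, an assumption).
    for i in range(1, n - 1):
        f = force[i] if force is not None else 0.0
        v[i] += dt / rho * ((s[i] - s[i - 1]) / dx + f)
    # Stress update: ds/dt = c * dv/dx, using the freshly updated velocities.
    for i in range(n - 1):
        s[i] += dt * c * (v[i + 1] - v[i]) / dx


# Hypothetical usage: 11 grid points, unit material properties,
# and a unit source force applied at the centre point.
v = [0.0] * 11
s = [0.0] * 10
force = [0.0] * 11
force[5] = 1.0
fd_step(v, s, rho=1.0, c=1.0, dx=1.0, dt=0.5, force=force)
```

The same update is applied at every grid point, so in a scheme of this kind the cost of a time step grows with the number of grid points assigned to a node. An explicit scheme like this is only stable when the time step is small relative to the grid spacing and the wave speed.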
The finite difference method is an approach widely used in the oil and gas industry for large-scale seismic wave equation modeling in complex media. Common commercial applications include reference synthetic seismic data generation, reverse time migration and full waveform inversion. Commercial finite-difference modelers typically solve the acoustic or anisotropic pseudo-acoustic wave equation in schemes that apply a common kernel across the model domain. These schemes are usually highly optimized, and include load balancing, which is needed to account for the additional cost of absorbing boundary regions in some domains.
There are different ways to decompose a domain into subdomains such that the simulations of the subdomains can be handled by computer nodes in parallel. A computer node or a node, as used in this application, refers to a computational unit which may have one or more processors or processor cores, or even one or more computer blades. Load balancing is considered here among the nodes rather than within a single node, because load balancing within a single node is either trivial or not a concern. When the subdomains and the computer nodes are relatively homogeneous, and the simulation load within each subdomain and the capability of each node are relatively uniform, the domain can be decomposed into uniform subdomains, each assigned to a computer node. A non-uniform decomposition may be achieved by weighting domain sizes within a regular, gridded decomposition scheme using simple heuristics.
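A weighting heuristic of this kind might look like the following minimal sketch. It assumes the domain is a one-dimensional stack of grid slabs with a known cost estimate for each slab, and it cuts the stack into contiguous subdomains of roughly equal total cost. The function name `partition_by_cost` and the example cost values are hypothetical and are not taken from this disclosure.

```python
from bisect import bisect_left
from itertools import accumulate


def partition_by_cost(costs, n_nodes):
    """Split slabs 0..len(costs)-1 into n_nodes contiguous (start, stop)
    ranges whose summed costs are roughly equal."""
    if n_nodes < 1 or len(costs) < n_nodes:
        raise ValueError("need at least one slab per node")
    prefix = list(accumulate(costs))  # running total of slab costs
    total = prefix[-1]
    bounds = [0]
    for k in range(1, n_nodes):
        target = total * k / n_nodes
        # Smallest number of slabs whose cumulative cost reaches the target.
        cut = bisect_left(prefix, target) + 1
        # Keep at least one slab for this node and for every remaining node.
        cut = max(cut, bounds[-1] + 1)
        cut = min(cut, len(costs) - (n_nodes - k))
        bounds.append(cut)
    bounds.append(len(costs))
    return [(bounds[i], bounds[i + 1]) for i in range(n_nodes)]


# Hypothetical example: the two slabs at each end contain absorbing
# boundaries and are assumed to cost three times as much as interior slabs.
costs = [3, 3, 1, 1, 1, 1, 1, 1, 3, 3]
ranges = partition_by_cost(costs, 3)
# ranges == [(0, 2), (2, 8), (8, 10)], each with a total cost of 6
```

With uniform costs the same function reduces to a uniform decomposition. The quality of the result depends entirely on how well the per-slab cost estimates predict the real run time.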
When a system is more complicated, such as in many seismic systems where the effects of attenuation, anisotropy and elasticity are modelled in complex geological structures, the simulation can be many orders of magnitude more complex than an acoustic system where the modeling requirements are homogeneous. The complexity of the system may make the load balancing more challenging for several reasons.
Firstly, there may not be enough computational capacity, even with today's supercomputers, to model an entire domain of a survey with a single unified computational scheme. Efficient or practical implementations may need a hybrid approach in which several different modeling kernels operate in different regions (or subdomains) of the model, possibly at different finite-difference grid resolutions. For example, in some parts of a survey domain, a simplified isotropic scheme with a coarse grid may be used, while in selected target volumes, an anisotropic scheme with a fine grid may be used. These regions are then coupled by applying suitable boundary conditions where they join. With this approach, the cost of finite-difference modeling varies significantly over the model domain in a manner that is difficult to account for using regular domain decomposition and heuristic predictive load balancing methods.
Secondly, computing hardware availability is leading to larger and increasingly heterogeneous cluster environments. This is partly because the pace of change in multi-core CPUs is rapid enough that clusters typically contain nodes spanning several generations of CPU technology, and also due to the emergence of mixed CPU and GPU clusters and hybrid cores. The non-uniformity of computing nodes makes load balancing more difficult to achieve.
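To illustrate why non-uniform nodes complicate the picture, the following minimal sketch assigns grid slabs to nodes in proportion to an assumed relative speed for each node, so that the predicted time per step is roughly equal. The function name `slabs_per_node`, the speed figures and the assumption that every slab costs the same are hypothetical simplifications, not a method described in this disclosure.

```python
def slabs_per_node(n_slabs, speeds):
    """Return how many equal-cost slabs each node should receive so that
    work is proportional to the node's relative speed."""
    total_speed = sum(speeds)
    ideal = [n_slabs * s / total_speed for s in speeds]  # fractional shares
    counts = [int(x) for x in ideal]                     # round down first
    leftover = n_slabs - sum(counts)
    # Hand the remaining slabs to the nodes with the largest fractional part.
    order = sorted(range(len(speeds)),
                   key=lambda i: ideal[i] - counts[i],
                   reverse=True)
    for i in order[:leftover]:
        counts[i] += 1
    return counts


# Hypothetical cluster: two older nodes and one node assumed twice as fast.
counts = slabs_per_node(400, [1.0, 1.0, 2.0])
# counts == [100, 100, 200]
```

In practice the relative speed of a node is rarely known in advance and may differ from one modeling kernel to another, which is part of what makes a fixed, predictive assignment of this kind hard to get right.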
Thirdly, the increase in the cost of seismic modeling is still outstripping the increase in available computing resources. It therefore becomes more important that finite-difference modeling jobs make the most efficient use of available cluster time.
It is desirable to find a better way to balance the computation load among available computer resources. Furthermore, it is desirable to find a way to adjust the distribution of the computation load among available computer resources if the load-balancing requirements change during the course of a computer simulation.