Many types of physical processes, including fluid flow in a petroleum reservoir, are governed by partial differential equations. These partial differential equations, which can be very complex, are often solved using finite difference, finite volume, or finite element methods. All of these methods divide the physical model into units called gridblocks, cells, or elements. In each of these physical units the solution is given by one or more solution variables or unknowns. Associated with each physical unit is a set of equations governing the behavior of these unknowns, with the number of equations being equal to the number of unknowns. These equations also contain unknowns from neighboring physical units.
Thus, there is a structure to the equations, with the equations for a given physical unit containing unknowns from that physical unit and from its neighbors. This is more conveniently depicted using a combination of nodes and connections, where a node is depicted by a small circle and a connection is depicted by a line between two nodes. The equations at a node contain the unknowns at that node and at the neighboring nodes to which it is connected.
The equations at all nodes are assembled into a single matrix equation. Often the critical task in obtaining the desired solution to the partial differential equation is solving this matrix equation. One of the most effective ways to do this is through the use of incomplete LU factorization or ILU, in which the original matrix is approximately decomposed into the product of two matrices L and U. The matrices L and U are lower triangular and upper triangular, respectively, and have non-zero structures similar to those of the lower and upper parts of the original matrix. With this decomposition, the solution is obtained iteratively by forward and backward substitutions.
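As an illustration of the preceding paragraph, the following is a minimal sketch of a zero-fill incomplete factorization, ILU(0), used inside a simple correction iteration. It is not taken from any particular simulator. The function names (`laplacian_2d`, `ilu0`, `lu_solve`, `solve_iteratively`), the dense list-of-lists storage, and the five-point test matrix are assumptions chosen only to keep the example short and self-contained.

```python
def laplacian_2d(nx, ny):
    """Hypothetical test matrix: 5-point stencil on an nx-by-ny grid of nodes."""
    n = nx * ny
    A = [[0.0] * n for _ in range(n)]
    for iy in range(ny):
        for ix in range(nx):
            p = iy * nx + ix
            A[p][p] = 4.0
            if ix > 0:
                A[p][p - 1] = -1.0
            if ix < nx - 1:
                A[p][p + 1] = -1.0
            if iy > 0:
                A[p][p - nx] = -1.0
            if iy < ny - 1:
                A[p][p + nx] = -1.0
    return A


def ilu0(A):
    """Gaussian elimination restricted to the non-zero pattern of A.

    Returns one matrix holding L (unit diagonal, strictly lower part)
    and U (diagonal and upper part). Assumes no zero pivots occur.
    """
    n = len(A)
    LU = [row[:] for row in A]
    for i in range(1, n):
        for k in range(i):
            if A[i][k] == 0.0:
                continue  # outside the pattern: no multiplier is stored
            LU[i][k] /= LU[k][k]
            for j in range(k + 1, n):
                if A[i][j] != 0.0:  # discard fill outside the pattern
                    LU[i][j] -= LU[i][k] * LU[k][j]
    return LU


def lu_solve(LU, r):
    """Solve (L U) x = r by forward, then backward, substitution."""
    n = len(LU)
    y = r[:]
    for i in range(n):
        for j in range(i):
            y[i] -= LU[i][j] * y[j]
    x = y[:]
    for i in reversed(range(n)):
        for j in range(i + 1, n):
            x[i] -= LU[i][j] * x[j]
        x[i] /= LU[i][i]
    return x


def solve_iteratively(A, b, iterations=50):
    """Repeatedly correct x using the approximate factors of A."""
    n = len(A)
    LU = ilu0(A)
    x = [0.0] * n
    for _ in range(iterations):
        residual = [b[i] - sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
        correction = lu_solve(LU, residual)
        x = [xi + ci for xi, ci in zip(x, correction)]
    return x


A = laplacian_2d(3, 3)
x = solve_iteratively(A, [1.0] * 9)
```

In practice the factors usually serve as a preconditioner for a Krylov method rather than for the plain correction loop shown here, and the matrix is stored in a sparse format. The sketch only shows how the incomplete factors and the two substitutions fit together.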
There is an ongoing need to obtain better solution accuracy. One way to do this is to divide the physical model into smaller physical units, or in other words to use more nodes, perhaps millions of them. Of course, the time needed to perform the computations increases as this is done. One way to avoid this time increase is to perform the computations in parallel on multiple processors.
There are two types of parallel computers, those using shared memory and those using distributed memory. Shared memory computers use only a handful of processors, which limits the potential reduction in run time. Distributed memory computers using tens of processors are common, while some exist that use thousands of processors. It is desired to use distributed memory parallel processing. In either case, the number of processors is typically an even number in parallel computing.
When using distributed memory, the computations are parallelized by decomposing the physical model into domains, with the number of domains being equal to the number of processors to be used simultaneously. Each domain is assigned to a particular processor, which performs the computations associated with that domain. Each domain contains a specified set of nodes, and each node is placed in a domain.
There are at least two main factors affecting parallel simulator performance when decomposing the domain for a given reservoir model: the number of active simulation cells in each decomposed domain and grid coloring. The challenge is keeping the number of active simulation cells balanced while not causing grid coloring issues. An active cell represents a unitary space in the reservoir model with oil, gas and/or water. An inactive cell, therefore, represents a unitary space in the reservoir model without oil, gas and/or water.
Conventional applications like Nexus®, which is a commercial software application offered by Landmark Graphics Corporation, do not take the number of active cells into account during 2D domain decomposition and typically divide the reservoir model into equal parts between the domains, regardless of the number of active simulation cells in each domain. With this approach, depending on the reservoir geometry, some domains may receive a disproportionate number of active cells, causing poor reservoir simulation performance.
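The imbalance described above can be illustrated with a small, hypothetical example. The sketch below is not the method of any particular application. It cuts a model into vertical strips in two ways: at equal widths, and at positions chosen so that the running count of active cells reaches equal shares. The function names and the per-column active-cell counts are assumptions made for illustration.

```python
def equal_width_cuts(n_columns, n_domains):
    """Cut positions that give every domain the same number of columns."""
    return [d * n_columns // n_domains for d in range(1, n_domains)]


def balanced_cuts(active_per_column, n_domains):
    """Cut positions chosen so each domain holds roughly total / n_domains
    active cells. A greedy prefix-sum sketch: a single very dense column
    can still leave a neighboring domain empty."""
    total = sum(active_per_column)
    cuts = []
    running = 0
    next_share = 1
    for column, count in enumerate(active_per_column):
        running += count
        while next_share < n_domains and running >= total * next_share / n_domains:
            cuts.append(column + 1)
            next_share += 1
    return cuts


def domain_loads(active_per_column, cuts):
    """Number of active cells that fall in each domain."""
    bounds = [0] + list(cuts) + [len(active_per_column)]
    return [sum(active_per_column[a:b]) for a, b in zip(bounds, bounds[1:])]


# Hypothetical model: active cells per grid column, concentrated off-center.
active = [0, 0, 1, 5, 8, 8, 2, 0]

equal = domain_loads(active, equal_width_cuts(len(active), 2))  # [6, 18]
balanced = domain_loads(active, balanced_cuts(active, 2))       # [14, 10]
```

With two domains, the equal-width split leaves one processor with three times the work of the other, while the count-aware split brings the loads much closer together. The sketch ignores grid coloring entirely.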
Parallelism in Nexus® is based upon a domain-decomposition approach; the entire model is subdivided or decomposed into collections of cells called grids. When Nexus® is run in parallel, a group of Nexus® processes is started with a specified number of processes, and each of the grids is assigned to one of these processes. The assignment can be specified by the user, or it can be determined by Nexus®. Each process performs the model computations upon the grids which were assigned to it. Some of these computations require data from grids assigned to other processes, namely when a cell has one or more of its immediately adjacent neighbors in a grid assigned to a different process.
The performance of Nexus®, when run in parallel, is affected significantly by the decomposition of the reservoir model cells into grids and the manner in which the grids are assigned to processes. Because the reservoir model grids are coupled through a pressure equation, data from adjacent grids affects the solution within a grid. Several times within every timestep, information needs to be passed from each grid to all grids physically connected to it. The assignment of grids to processes must therefore account for the physical connections between cells in the model. This is one reason why a typical decomposition often looks like a collection of building blocks: compact blocks minimize the inter-grid communications.
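The effect of domain shape on communication can be made concrete by counting the connections that cross domain boundaries. The sketch below is an illustration under assumed conditions: an 8-by-8 grid of cells with connections only between horizontally and vertically adjacent cells, divided among four processes either as strips or as square blocks. The function and variable names are hypothetical.

```python
def cut_connections(nx, ny, domain_of):
    """Count cell-to-cell connections whose two ends lie in different domains.

    domain_of(ix, iy) returns the domain index of the cell at (ix, iy).
    Each connection is counted once, from its lower-index cell.
    """
    cuts = 0
    for iy in range(ny):
        for ix in range(nx):
            if ix + 1 < nx and domain_of(ix, iy) != domain_of(ix + 1, iy):
                cuts += 1
            if iy + 1 < ny and domain_of(ix, iy) != domain_of(ix, iy + 1):
                cuts += 1
    return cuts


def strips(ix, iy):
    """Four vertical strips, each 2 cells wide and 8 cells tall."""
    return ix // 2


def blocks(ix, iy):
    """Four square blocks, each 4 cells by 4 cells."""
    return (ix // 4) + 2 * (iy // 4)


strip_cuts = cut_connections(8, 8, strips)  # 24 connections cross boundaries
block_cuts = cut_connections(8, 8, blocks)  # 16 connections cross boundaries
```

Both decompositions give each process 16 cells, but the square blocks cut fewer connections, so less data has to be exchanged between processes in each communication step.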