In many areas of science and engineering, complex physical problems are solved using mathematical models which are discretized over space. Various methods can be used to convert the continuous mathematical model into a discretized form, the most common being the Finite Difference, Finite Element and Finite Volume methods.
The finite volume method is commonly used in Computational Fluid Dynamics applications, and in related fields such as Oil Reservoir Simulation. This method splits a problem domain into grid blocks (cells) by superimposing a mesh or grid of some type. Fluid conservation equations are then constructed by considering the accumulation of fluids within a cell, and flow across the cell faces to neighboring cells, or to external sinks or sources, such as oil wells. Mass accumulation and flow rates depend on fluid properties such as the pressure, temperature and composition in each grid block and its neighbors. For time varying problems, the model is advanced through time by finding the values of the fluid properties that simultaneously satisfy the conservation equations in every cell for each discrete time-step.
In general these equations are non-linear, and Newton's method (or similar) is used to reduce the problem to the repeated solution of simultaneous linear equations linking all the grid blocks.
Because fluids flow only across the common faces of neighboring cells, and not between distant cells having no common face, the matrix representing the linear equation coefficients is sparse, and has a distinctive structure. For one-dimensional problems, the matrix is tridiagonal. For two- and three-dimensional problems, the matrix has a distinctive nested block tridiagonal structure (see FIGS. 1 and 2 of the accompanying drawings); the ordering of grid blocks that leads to this structure will be discussed further below.
Similar considerations apply to other space discretization methods: the end result is the requirement to solve sparse linear equations, which, with appropriate ordering of nodes, have the distinctive nested block tridiagonal structure described above.
Many common methods for iterative solution of the linear equations depend on the use of a “preconditioner”—a fast procedure for obtaining an approximate solution to the linear equations. The NF (Nested Factorization) algorithm is a preconditioner which, unlike most others, is specifically designed to approximate the nested block tridiagonal matrices which arise from space discretization. In its original form, NF can be applied to topologically rectangular meshes, but not to tetrahedral, and other more general meshes.
One practical application is in the area of oil reservoir simulation. Oil and gas are produced from porous underground rock formations containing both water and hydrocarbons. Fluids, including oil and gas are extracted through wells, and wells are also used to inject fluids, such as water, steam, carbon dioxide or chemicals with a view to improving overall production—for example by pushing oil towards a well, or making sticky oil more mobile.
Engineers use oil reservoir simulators to understand what is happening in the reservoir and to explore alternative strategies for optimizing outcomes. The results of a simulation may determine when and where new wells are drilled, and how they are controlled.
Modeling of a reservoir typically proceeds through two phases—history matching (see FIG. 12) and prediction (see FIG. 13). In the history matching phase, the past production of a field and its wells is repeatedly modeled with variations to the geological model designed to improve the match between historical fact and simulation. Once an adequate match is obtained, prediction runs can be started. These runs explore the consequences of alternative operating plans, often extending for several decades into the future. After the chosen plan is put into operation, the model will be re-run from time to time to tune the match, and refine the predictions.
The operation of a typical reservoir simulator is summarized in FIG. 14, and described below:                1. The first step is to read data describing the reservoir model defined by the reservoir engineer. Typically this comprises:—                    Geological data from numerous sources, including seismic analysis, rock cores and well log analysis. The rock porosity and directional permeabilities are key variables, and often vary greatly across the reservoir. The location and characteristics of geological faults must also be specified.            Details of the computational grid; fine grids give better results, but increase computation time.            Fluid properties, such as viscosity, density and phase transition information. Relative permeabilities are used to characterize the mobility of different phases when mixtures are present. Fluid properties also vary spatially.            Sufficient information to determine the initial state of the main solution variables; these variables will include pressure, and probably the saturations of oil, water and gas. Other variables may represent the hydrocarbon composition and temperature. There may be from two up to 20 or more independent variables in each grid block. The simulator will model the changes to these variables for every grid block through a series of discrete time-steps. This solution is an n-vector, referred to here as x. n is usually the number of grid blocks multiplied by the number of solution variables per grid block.            Additional data specifying detailed control and reporting options.                        2. Next the engineer's proposed schedule of operation is read in. This consists principally of the location and characteristics of all production and injection wells, and details of how they will be operated. For example a well may be opened months or years after the start of the simulation, and its production or injection rates determined partly by conditions in the reservoir or constraints of surface facilities.        3. The simulator begins to loop over time steps which will advance the simulation through the schedule of operations. When the end of the schedule is reached, the simulator produces some final summary data and halts.        4. The simulator selects a time-step by which to advance. This may be anything from a few seconds to a year or more, depending on what is required by the schedule, and by numerical considerations.        5. The simulator makes an estimate (dx) of the change in the solution variables between the start and end of the time-step. dx, like x, is an n-vector.        6. The simulator now begins a second loop to solve the non-linear equations which describe mass conservation in each grid block.        7. Calculate the residuals - which reflect the degree to which material is not conserved for each grid block. For a given grid block, the residuals are found by calculating the amount of all fluids in the cell at the beginning and end of the time step. This difference must be made up by flows from neighboring grid blocks and wells. Both quantities (fluid in place and flow in or out) depend on the solution variables (pressure, composition, temperature etc.) for the cell and its neighbors (x and x+dx).        8. If material does not balance, then the residuals are non-zero, and a solution must continue to be sought.        9. If the residuals are sufficiently close to zero, update the solution variablesx=x+dx                    and return to step 4                        10. If the iteration has not converged, but the maximum permitted number of iterations has been reached, the current iteration is abandoned, and a solution is instead sought to a simpler problem. A shorter time step is selected, and processing returns to step 5.        11. If the non-linear equations have not converged, the simulator assembles a matrix (A) of derivatives using the current best estimate of the solution. Each matrix coefficient represents the rate of variation of a residual (local material balance error) with respect to a solution variable. The resulting matrix equationA.ddx=r                    (where A is an nxn matrix, ddx is an n-vector of changes to the solution variables, and r is an n-vector of residuals) is a linearized version of the non-linear equations. The solution procedure is just an n-dimensional matrix form of Newton's method for finding the zeros of a one-dimensional function.                        12. Solve the linear equation formed in step 11. This is often the most time consuming step in the procedure.        13. Update the current estimate of the change to the solution variables during the time stepdx=dx+ddx                    and return to step 7                        
Oil reservoir simulators are extremely demanding applications which constantly push the boundaries of what computers can do. Engineers have always demanded more sophisticated physics (more variables per grid block) and finer meshes (more grid blocks) both of which trends increase computational requirements. In addition, convergence of the linear and non-linear equations to solution is not a forgone conclusion. Many models devised by engineers simply don't work because the simulator cannot find a solution.
The most time consuming part of a simulation is the solution of the linear equations (step 12 above). Direct methods, such as Gaussian elimination, cannot be used because they would require far too much memory, and take far too long. Most simulators now use an iterative method based on GMRES, or a similar Krylov subspace method, with a preconditioner to accelerate convergence. Reference 1 (a list of numbered references is included at the end of the description) contains an excellent summary of these methods. The preconditioner (B) is an approximation to A chosen such that B−1.x can be calculated quickly for an arbitrary vector x. The choice of preconditioner is critical: with a poor choice, the iteration may be very slow, or may not converge at all. In extreme cases the simulator will fail, even with the best known solution methods.
FIG. 15 shows how a preconditioner fits into the operation of a Krylov subspace iterative solver such as GMRES (in FIG. 15, forward-references to FIGS. 16 to 18 should not be interpreted as implying that the content of FIGS. 16 to 18 relate generally to the NF method).                1. The first step is to define the ordering of the matrix variables. For some preconditioners, the order is unimportant, but for NF it is critical. This step is discussed in detail below.        2. Next the preconditioner is factored into block upper and lower factors, B=L.U. All the elements of L and U are calculated. In most practical preconditioners, including NF, L and U retain most of the sparsity of A, and most non-zeros are the same as the corresponding terms in A. In contrast, an exact factorization of A would have few non-zeros. Once this is done, the preconditioner can be applied easily to arbitrary vectors during the iteration phase starting at step 4.        3. An initial estimate of the solution (x0) is generated, or supplied by the calling program. Often x0 is set to 0. The initial residual (r0=b−Ax0) is calculated.        4. The iteration counter, m is set to 1.        5. The preconditioner is used to find an approximate solution to the equation A.qm=rm−1qm=B−1.rm−1                     If B were exactly equal to A, qm would take us to the solution.                        6. Use the matrix multiply routine to form the vector zm zm=A.qm         7. Find the scalar values α1 . . . αm which minimize the norm of rm where        
      r    m    =            r      0        +                  ∑                  i          =          1                m            ⁢                        α          i                ·                  z          i                                    8. If rm is below the convergence target then exit with the solution        
      x    m    =            x      0        +                  ∑                  i          =          1                m            ⁢                        α          i                ·                  q          i                                    9. If the iteration limit has been reached, then exit reporting that the method has failed        10. Increment the iteration counter, m and return to step 5.        
The Nested Factorization (NF) Algorithm is a known pre-conditioning method which may be used to accelerate the convergence of the conjugate gradient, GMRES, OrthoMin and other similar algorithms for iterative solution of large sparse sets of linear equations. The algorithm is described in references (2) to (4), and summarized at http://www.polyhedron.co.uk/nf-Nested Factorization0htm. This original NF algorithm assumes that the matrix is formed by considering connections between neighboring blocks in a simple topologically rectangular grid.
For an nx by ny by nz grid, the banded coefficient matrix A isA=d+u+l+v+m+w+nwhere                d is the diagonal,        u and l are the bands immediately above and below the diagonal, which connect adjacent cells within a line        v and m are the bands nx elements away from the diagonal, which connect adjacent lines within a plane        w and n are the bands nx*ny elements away from the diagonal, which connect adjacent planes        
For example, given a 4×4×2 grid (FIG. 1), the matrix has the structure shown in FIG. 2, based on an ordering of cells as shown in FIG. 1.
This matrix can be regarded as recursively block tridiagonal. At the outermost level each block element represents a plane, or the coefficients connecting adjacent planes. Within each plane, the blocks represent lines, and coefficients connecting adjacent lines. At the innermost level, there are individual grid blocks and the connections between individual grid blocks within lines.
NF exploits this recursive structure by defining the preconditioner, B, recursively:—
B = (I + n · P−1) · (P + w)(B is block tridiagonal -block elements are planes)whereIis the unit matrixandP = (I + m · T−1) · (T + v)(P is block tridiagonal -block elements are lines)andT = d + I + u − approx(ε)(T is tridiagonal)andε = m · T−1 · v + n · P−1 · w(the uncorrected error matrix)
approx(ε) is a matrix which is computed to approximate ε. It is usually diagonal, but could be tridiagonal. Common approximations are:—    diag(ε) the diagonal elements of ε    rowsum(ε) the diagonal matrix formed by summing the elements of ε in rows    colsum(ε) The diagonal matrix formed by summing the elements of ε in columns.
After expansion, the following is obtained:—B=A+ε−approx(ε)
In summary, B is a partial L.U factorization, but unlike most other similar methods, the factors L and U are block (rather than strictly) lower and upper triangular. For the example 4×4×2 finite difference matrix, (I+n.P−1) and (P+w) are 2×2 matrices
IPawn · Pa−1IPb
where Pa and Pb are themselves matrices which connect cells within the 2 planes, and n, w are diagonal matrices containing elements from A. To compute B−1.x, where x is a vector, first sweep forward through the planes to obtain (I+n.P−1)−1.x, and then back to apply (P+w)−1. The problem is thus reduced to solving P−1.x for each plane, and from three dimensions to two.
For a suitable choice of P (P=A−n−w−n.P−1.w), this factorization of A is exact. However an exact factorization is not practical for large matrices, and instead, each block element of P is approximated in the same way—with an incomplete block L.U factorization.Pa=(I+m.T−1).(T+v)
For the example matrix, (I+m.T−1) and (T+v) are 4×4 matrices
ITavama Ta−1ITbvbmb Tb−1ITcvcmc Tc−1ITd
where Ta, Tb, Tc and Td are themselves 4×4 matrices which connect cells within a line, and m, v are diagonal matrices containing elements from A. The evaluation of Pa−1.x proceeds in the same way as that of B−1.x, and the problem is further reduced from 2 dimensions to 1. The solution of the one-dimensional problems, which are of the form Ta−1.x (where Ta is tridiagonal) is then a simple matter which terminates the recursion. There are three levels of recursion because the computational grid occupies a three-dimensional space.
In fact, the NF preconditioner can be shown to be equivalent to a strictly triangular factorization, but with L and U factors that include a good subset of the fill-in bands produced by a Cholesky factorization. However the computational work and storage required by NF is much less than would be required to calculate the fill-ins. This explains why NF performs better than ILU methods.
The NF algorithm was designed around rectangular grid geometries, and the innermost loops involve the solution of tridiagonal matrices representing lines of grid blocks. In practice, NF has been extended from this simple case in a number of ways, for example:—                Extension to coupled “multi-phase” problems with several independent variables (e.g. pressure, temperature and composition) in each grid block. In this case individual matrix elements are small (e.g. 4×4) dense matrices rather than scalars, but the algorithm is otherwise unchanged.        Accommodation of “non-neighbor” connections that can be used to simulate some more complex geometries, such as local grid refinements        Accommodation of “nine-point” finite difference terms (non-zero terms connecting diagonally adjacent grid blocks)        “Multi-Color” orderings designed to allow efficient execution on parallel computers.        Extra levels of recursion to deal with some classes of problem which can be thought of as four or five dimensional (multiple porosity models).        Inclusion of higher order error correction terms on the l and u terms in the tridiagonal.        Reference 5 describes a method for applying NF to 2D unstructured meshes and 3D prismatic meshes. The method retains the assumption of tridiagonal inner blocks.        
In every case, the basic operation in the extended versions of NF remains as the solution of a tridiagonal matrix (or block tridiagonal in the multi-phase case) representing a strictly one-dimensional line of grid blocks. This restriction limits the application of NF in a number of ways:—                There is limited scope for “higher order” versions which produce a more accurate preconditioner at the cost of some additional computation. In contrast, the family of ILU factorizations achieve this by allowing more “fill-in”, so that the factorizations in the series ILU0, ILU1, ILU2 etc. are each more accurate than the one before.        “Non-neighbor” connections break the method if they occur between grid-blocks in the same line. For example, in a model representing the surface of a sphere, the natural inner element is a circle rather than a line of blocks. Another example occurs in oil reservoir simulation, where a well passing through several grid blocks may introduce additional connections between non-adjacent cells.        NF cannot be used for less regular geometries, for example the tetrahedral or irregular meshes common in CFD and structural analysis.        