Geophysical inversion [1,2] attempts to find a model of subsurface properties that optimally explains observed data and satisfies geological and geophysical constraints. There are a large number of well known methods of geophysical inversion. These well known methods fall into one of two categories, iterative inversion and non-iterative inversion. The following are definitions of what is commonly meant by each of the two categories:
Non-iterative inversion—inversion that is accomplished by assuming some simple background model and updating the model based on the input data. This method does not use the updated model as input to another step of inversion. For the case of seismic data these methods are commonly referred to as imaging, migration, diffraction tomography or Born inversion.
Iterative inversion—inversion involving repetitious improvement of the subsurface properties model such that a model is found that satisfactorily explains the observed data. If the inversion converges, then the final model will better explain the observed data and will more closely approximate the actual subsurface properties. Iterative inversion usually produces a more accurate model than non-iterative inversion, but is much more expensive to compute.
Iterative inversion is generally preferred over non-iterative inversion, because it yields more accurate subsurface parameter models. Unfortunately, iterative inversion is so computationally expensive that it is impractical to apply it to many problems of interest. This high computational expense is the result of the fact that all inversion techniques require many compute intensive simulations. The compute time of any individual simulation is proportional to the number of sources to be inverted, and typically there are large numbers of sources in geophysical data. The problem is exacerbated for iterative inversion, because the number of simulations that must be computed is proportional to the number of iterations in the inversion, and the number of iterations required is typically on the order of hundreds to thousands.
The most commonly employed iterative inversion method employed in geophysics is cost function optimization. Cost function optimization involves iterative minimization or maximization of the value, with respect to the model M, of a cost function S(M) which is a measure of the misfit between the calculated and observed data (this is also sometimes referred to as the objective function), where the calculated data is simulated with a computer using the current geophysical properties model and the physics governing propagation of the source signal in a medium represented by a given geophysical properties model. The simulation computations may be done by any of several numerical methods including but not limited to finite difference, finite element or ray tracing. The simulation computations can be performed in either to frequency or time domain.
Cost function optimization methods are either local or global [3]. Global methods simply involve computing the cost function S(M) for a population of models {M1, M2, M3, . . . } and selecting a set of one or more models from that population that approximately minimize S(M). If further improvement is desired this new selected set of models can then be used as a basis to generate a new population of models that can be again tested relative to the cost function S(M). For global methods each model in the test population can be considered to be an iteration, or at a higher level each set of populations tested can be considered an iteration. Well known global inversion methods include Monte Carlo, simulated annealing, genetic and evolution algorithms.
Unfortunately global optimization methods typically converge extremely slowly and therefore most geophysical inversions are based on local cost function optimization. Algorithm 1 summarizes local cost function optimization.
Algorithm 1Algorithm for performing local cost function optimization.1.selecting a starting model,2.computing the gradient of the cost function S(M) with respectto the parameters that describe the model,3.searching for an updated model that is a perturbation of thestarting model in the negative gradient direction that betterexplains the observed data.
This procedure is iterated by using the new updated model as the starting model for another gradient search. The process continues until an updated model is found that satisfactorily explains the observed data. Commonly used local cost function inversion methods include gradient search, conjugate gradients and Newton's method. Next, this background information will be explained in somewhat more detail.
Local cost function optimization of seismic data in the acoustic approximation is a common geophysical inversion task, and is generally illustrative of other types of geophysical inversion. When inverting seismic data in the acoustic approximation the cost function can be written as:
                              S          ⁡                      (            M            )                          =                              ∑                          g              ⁢                                                          =              1                                      N              g                                ⁢                                    ∑                              r                =                1                                            N                r                                      ⁢                                          ∑                                  t                  =                  1                                                  N                  t                                            ⁢                              W                ⁡                                  (                                                                                    ψ                        calc                                            ⁡                                              (                                                  M                          ,                          r                          ,                          t                          ,                                                      w                            g                                                                          )                                                              -                                                                  ψ                        obs                                            ⁡                                              (                                                  r                          ,                          t                          ,                                                      w                            g                                                                          )                                                                              )                                                                                        (        1        )            where:    S=cost function,    M=vector of N parameters, (m1, m2, . . . mN) describing the subsurface model,    g=gather index,    wg=source function for gather g which is a function of spatial coordinates and time, for a point source this is a delta function of the spatial coordinates,    Ng=number of gathers,    r=receiver index within gather,    Nr=number of receivers in a gather,    t=time sample index within a trace,    Nt=number of time samples,    W=minimization criteria function (we usually choose W(x)=x2 which is the least squares (L2) criteria),    ψcalc=calculated seismic pressure data from the model M,    ψobs=measured seismic pressure data.
The gathers of seismic data in Equation 1 can be any type of gather that can be simulated in one run of a seismic forward modeling program. Usually the gathers correspond to a seismic shot, although the shots can be more general than point sources. For point sources the gather index g corresponds to the location of individual point sources. For plane wave sources g would correspond to different plane wave propagation directions. This generalized source data, ψobs, can either be acquired in the field or can be synthesized from data acquired using point sources. The calculated data ψcalc on the other hand can usually be computed directly by using a generalized source function when forward modeling. For many types of forward modeling, including finite difference modeling, the computation time needed for a generalized source is roughly equal to the computation time needed for a point source.
Equation 1 can be simplified to:
                              S          ⁡                      (            M            )                          =                              ∑                          g              =              1                                      N              g                                ⁢                      W            ⁡                          (                              δ                ⁡                                  (                                      M                    ,                                          w                      g                                                        )                                            )                                                          (        2        )            where the sum over receivers and time samples is now implied and,δ(M,wg)=ψcalc(M,wg)−ψobs(wg)  (3)
The object of inversion by cost function optimization is to attempt to update the model M such that S(M) is a minimum. This can be accomplished local cost function optimization which updates the given model M(k) as follows:M(k+1)=M(k)−α(k)∇MS(M)  (4)where k is the iteration number, α is the scalar size of the model update, and ∇MS(M) is the gradient of the misfit function, taken with respect to the model parameters. The model perturbations, or the values by which the model is updated, are calculated by multiplication of the gradient of the objective function with a step length α, which must be repeatedly calculated.
From Equation 2, the following equation can be derived for the gradient of the cost function:
                                          ∇            M                    ⁢                      S            ⁡                          (              M              )                                      =                              ∑                          g              =              1                                      N              g                                ⁢                                                    ∇                M                            ⁢                              W                ⁡                                  (                                      δ                    ⁡                                          (                                              M                        ,                                                  w                          g                                                                    )                                                        )                                                      .                                              (        5        )            So to compute the gradient of the cost function, it is necessary to separately compute the gradient of each gather's contribution to the cost function, then sum those contributions. Therefore, the computational effort required for computing ∇MS(M) is Ng times the compute effort required to determine the contribution of a single gather to the gradient. For geophysical problems Ng usually corresponds to the number of geophysical sources (each location of a source apparatus being considered a separate source) and is on the order of 10,000 to 100,000, greatly magnifying the cost of computing ∇MS(M).
It may be noted that computation of ∇MW(δ) requires computation of the derivative of W(δ) with respect to each of the N model parameters mi. Since for geophysical problems N is usually very large (the number of different parameters times the number of model grid cells where the parameters must be assigned values is usually more that one million), this computation can be extremely time consuming if it had to be performed for each individual model parameter.
What is needed is a more efficient method of computing the cost function gradient, without significant reduction in the accuracy of the local cost function optimization. The present invention satisfies this need.