Inversion [see, for example, Tarantola, 1984] attempts to find a model that optimally explains observed data. Local inversion methods that minimize the value of an objective function that measures the difference between simulated and observed data, are often the only practical means of solving an inversion problem for a model having a large number of free parameters. These local methods require an initial guess for the model to be inverted. They then iteratively update the model to make it closer to the true solution by searching for a perturbation of the current model in a direction based on the gradient of the objective function. Unfortunately, the objective function often has many minima, not just one minimum corresponding to the solution model. These other minima are called local minima, while the minimum corresponding to the desired solution is termed the global minimum. If the starting model for inversion is too close to the model corresponding to one of these local minima, then local inversion methods will get stuck near that local minimum and never iterate away from it to the global minimum. Thus, the wrong solution is produced no matter how much effort is expended on iteration.
This local minima problem can be solved by first iterating inversion on an altered objective function that has fewer local minima, but has a global minimum near the location of the desired solution. The result of iterating on this altered objective function should produce a model closer to the desired solution. This more accurate model is then used as the initial model for iteration on the original objective function. Since this new initial model is close to the global minimum of the original objective function, iteration on the original objective function should now produce an accurate solution. This technique of iterating on an altered objective function is often termed multi-resolution, or multi-grid, or multi-scale inversion, which is discussed further below.
There are a large number of well known methods of inversion. These methods fall into one of two categories, iterative inversion and non-iterative inversion. The following are definitions of what is commonly meant by each of the two categories:                Non-iterative inversion—inversion that is accomplished by assuming some simple background model and updating the model based on the input data. This method does not use the updated model as input to another step of inversion. For the case of seismic data these methods are commonly referred to as imaging, migration, diffraction tomography, linear inversion, or Born inversion.        Iterative inversion—inversion involving repetitious improvement of the subsurface properties model such that a model is found that satisfactorily explains the observed data. If the inversion converges, then the final model will better explain the observed data and will more closely approximate the actual subsurface properties. Iterative inversion usually produces a more accurate model than non-iterative inversion, but is much more expensive to compute.        
Wave inversion means any inversion based on a wave simulator, such as acoustic or seismic inversion. The iterative method most commonly employed in wave inversion is objective function optimization. Objective function optimization involves iterative minimization of the value, with respect to the model M, of an objective function S(M) which is a measure of the misfit between the calculated and observed data (this is also sometimes referred to as the cost function). The calculated data are simulated with a computer programmed to use the physics governing propagation of the source signal in a medium represented by the current model. The simulation computations may be done by any of several numerical methods including but not limited to finite differences, finite elements or ray tracing. Following Tarantola [Tarantola, 1984], the most commonly employed objective function is the least squares objective function:S(M)=(u(M)−d)TC−1(u(M)−d),  (1)where T represent the vector transpose operator and:    M=the model which is a vector of N parameters [m1, m2, . . . , mN]T,    d=measured data vector (sampled with respect to source, receiver and time),    u(M)=simulated data vector for model M (sampled with respect to source, receiver and time),    C=the covariance matrix.Objective function optimization methods are either local or global [Fallat, et al., 1999]. Global methods simply involve computing the objective function S(M) for a population of models {M1, M2, M3, . . . } and selecting a set of one or more models from that population that approximately minimize S(M). If further improvement is desired this newly selected set of models can then be used as a basis to generate a new population of models that can be again tested relative to the objective function S(M). Global methods are more likely to converge to the correct solution than local methods, but are too expensive to apply to large scale inversion problems having many model parameters. Well known global inversion methods include Monte Carlo, simulated annealing, genetic and evolution algorithms.
Local objective function optimization involves:
Algorithm 1 - Algorithm updating a model using local objectivefunction optimization.1.Set the current model to be the starting model,2.Compute the gradient, ∇ MS(M), of the objective function withrespect to the parameters that describe the model,3.Search for an updated model, that is a perturbation of the startingmodel in a direction based on the gradient, that betterexplains the observed data.4.If updated model is not sufficiently accurate then return to step2 using the updated model as the current model, otherwiseterminate.Local inversion methods are much more efficient than global methods, and are therefore the only practical methods to apply to a large scale inversion problem. Commonly used local objective function inversion methods include steepest descent, conjugate gradients and Newton's method.
It should be noted that computation of ∇MS(M), the second step Algorithm 1, requires computation of the derivative of S(M) with respect to each of the N model parameters mi. When N is very large (roughly more than a thousand), this computation can be extremely time consuming if it had to be performed for each individual model parameter. Fortunately, the adjoint method can be used to efficiently perform this computation for all model parameters at once [Tarantola, 1984]. The adjoint method for the least squares objective function and a gridded model parameterization (M is a vector with each element representing the model's value in a grid cell) is summarized by the following algorithm:
1.Compute forward simulation of the data using the current model, M(k) with k being the current iteration, to get u(M(k)),2.Subtract the observed data from the simulated data giving δ(M(k)),3.Compute the reverse simulation (i.e. backwards in time) using C−1 δ(M(k)) as the source, producing uadjoint(M(k)),4.Finally, ∇ MS(M(k)) = uadjointT(M(k)) A u(M(k)), where A represents the adjoint operator (e.g. identity for gradients with respect to components of M representing bulk modulus, or spatial gradient for gradients with respect to components of M representing density).Algorithm 2—Algorithm for computing the least-squares cost-function gradient of a gridded model using the adjoint method.
Local objective function optimization is generally much less expensive than global objective function optimization, but requires a more accurate starting model. This more accurate starting model is required, because the objective function often has many minima and local optimization methods will generally find the closest of these minima. The minimum corresponding to the true model is called the global minimum and all other minima are termed local minima. If the starting model is not nearest to the global minimum then a local optimization technique will likely yield an inaccurate inverted model that corresponds to the closest local minimum. This is illustrated in FIG. 1 where the objective is to invert for a model M which has two parameters m1 and m2. The dashed contours 110 display the values of the objective function as a function of the parameters m1 and m2. The global minimum 120 is marked by a solid black circle and two local minima 130 and 140 are shown by gray filled circles. The inversion starts at initial model M(0) (150) and proceeds by local optimization to the iteration one model M(1) and so forth to model M(3) (160). No matter how many more iterations of local optimization are attempted the inverted model will only get closer to the local minimum 130 near M(3), rather than approximating the global minimum 120.
Several methods have been proposed that attempt to overcome this local minima problem. As mentioned above, many of these methods involve iterating on an altered objective function during the early iterations of the inversion. This altered objective function is chosen to have fewer local minima, but to have a global minimum near the original objective function's global minimum. By this means, the early iterations will produce a model that though inaccurate, is significantly closer to the original objective function's global minimum. FIG. 2 illustrates a local optimization corresponding to FIG. 1 but using an altered objective function having fewer local minima. The altered objective function has a global minimum 210 (the solid black circle) that is close to but not at the same location as the global minimum 220 for the original objective function (the cross-hatched circle). Starting from the initial model M(0) (230), which is the same as the initial model used in FIG. 1, two iterations using the altered objective function results in a model M(2) (240). This model M(2) can then be used as the initial model for further iterations but now using the original objective function. This is illustrated in FIG. 3 where the iteration 2 model from FIG. 2 (shown in gray), renumbered 310, is used as the starting model. Iteration now converges to a model M(4) (320) near the global minimum 220 rather than near a local minimum as in FIG. 1. Because the starting model is more accurate than the original starting model, the inversion now iterates to the correct solution.
Typically when altering the original objective function, the number of local minima in the altered objective function is inversely related to the distance between the global minima of the original and that of the altered objective function. Thus, it can be advantageous to iterate on a sequence of altered objective functions starting with one having the fewest number of local minima and least accurate global minimum, proceeding through objective functions that have increasing numbers of local minima and increasing accuracy of the global minim, then ending by iterating on the original objective function. Methods that perform initial iterations on altered objective functions having few local minima are often referred to as multi-scale or multi-grid methods, and a flow chart for this technique is illustrated in FIG. 4.
The process starts at step 410 by choosing an alteration of the original objective function to optimize. This altered objective function, which depends on the data to be fit 420, is iterated at step 430 until the altered objective function is found to be sufficiently minimized at step 440. (The value is less than a selected maximum or another stopping condition is met.) When that occurs, it is determined at step 450 whether the current inverted model sufficiently minimizes the original objective function. If not, the process returns to step 410 and either choose a new altered objective function or the original objective function to optimize. Eventually, the process ends (460).
Two altered objective functions have been proposed in the literature for solving the local minima problem in seismic full wave field inversion (“FWI”):                High cut filters—Bunks (Bunks et al., 1995) describes altering the least squares objective function by applying time invariant high-cut filters (sometimes called a low-pass filter, meaning a filter that passes frequencies below its cutoff frequency and rejects frequencies above it) to both the measured data and the source signature used for computing the simulated seismic data. The high-cut, i.e. cutoff, frequency of these filters is then increased as the inversion is iterated, with no filter being applied for the final iterations (no filter corresponding to the original objective function). It is well known how to design such filters; see, for example, Press et al., Numerical Recipes in FORTRAN, The Art of Scientific Computing, Cambridge University Press (1992). They may also be obtained from sources such as Seismic Un*x (see http://www.cwp.mines.edu/cwpcodes/).        Layer stripping—Maharramov (Maharramov et al., 2007) teaches that initial iterations of inversion should be localized to shallow layers, and that this depth range be extended as iteration proceeds. Correspondingly, when only shallow depths are inverted, then only shorter times in the data are inverted, because a shallow model can only predict the shorter time portions of the data.In general FWI will converge to the global minimum if the starting model is accurate enough to predict traveltimes for any propagation mode to within a half period of that mode. This may be called Pratt's criterion. (“In practice, with seismic waveform inversion this implies that much of the waveform energy must be predicted (by the initial model) to within a half-wavelength of the observed waveforms; if not, a minimum misfit model will be obtained when the predicted waveforms match the wrong cycle in the observed waveforms.”—Pratt, 1999).        
The present invention is an improved method for reducing the accuracy requirements on the starting model when performing local objective function optimization.