During seismic, electromagnetic, or a similar survey of a subterranean region, geophysical data are acquired typically by positioning a source at a chosen shot location, and measuring seismic, electromagnetic, or another type of back-scattered energy generated by the source using receivers placed at selected locations. The measured reflections are referred to as a single “shot record”. Many shot records are measured during a survey by moving the source and receivers to different locations and repeating the aforementioned process. The survey can then be used to perform inversion, e.g., Full Waveform/Wavefield Inversion in the case of seismic data, which uses information contained in the shot records to determine physical properties of the subterranean region (e.g., speed of sound in the medium, density distribution, resistivity, etc.). Inversion is an iterative process, each iteration comprising the steps of forward modeling to create simulated (model) data and objective function computation to measure the similarity between simulated and field data. Physical properties of the subsurface are adjusted at each iteration to ensure progressively better agreement between simulated and field data. The invention will be described primarily in the context of Full Waveform Inversion of seismic data, but can be applied to inversion of other types of geophysical data.
Multi-parameter inversion involves simultaneous updating of at least two medium properties. A typical strategy is to formulate an objective (cost) function E(m) measuring the misfit between modeled and field data, where m is a vector of medium properties whose components can be compressional and shear-wave velocities, Vp and Vs, density ρ, Thompsen anisotropy parameters ϵ and δ (Tsvankin, 2001, p. 18), etc. The gradient of the objective function with respect to individual components of m is indicative of the direction in which medium parameters can be updated so that the objective function is minimized and progressively better fit of modeled and field data is obtained. The basis of this approach is the well-known Taylor series:
            E      ⁡              (                  m          +                      Δ            ⁢                                                  ⁢            m                          )              =                  E        ⁡                  (          m          )                    +                        (                                    ∇              m                        ⁢            E                    )                ⁢        Δ        ⁢                                  ⁢        m            +                        1          2                ⁢        Δ        ⁢                                  ⁢                              m            T                    ⁡                      (                                          ∇                mm                            ⁢              E                        )                          ⁢        Δ        ⁢                                  ⁢        m            +      …        ⁢          ,
where Δm is the desired update; ∇mE and ∇mmE are the gradient and the Hessian of the objective function respectively. The gradient ∇mE is a vector containing first-order derivatives of the objective function E with respect to each individual component mi of the model vector m:
            ∇      m        ⁢    E    =            [                        ∂          E                          ∂                      m            i                              ]        .  The Hessian ∇mmE is a matrix containing second-order derivatives of the objective function E with respect to individual components mi, mj:
            ∇      mm        ⁢    E    =            [                        ∂          E                                      ∂                          m              i                                ⁢                      ∂                          m              j                                          ]        .  Clearly, if we neglect quadratic terms (the ones with the Hessian) of this expansion and set Δm=−α∇mE, with α>0, then the objective function will decrease:E(m+Δm)=E(m)+(∇mE)Δm=E(m)−α(∇mE)2<E(m).Optimal α can be determined with the help of line search, which typically involves evaluating the objective (cost) function for strategically chosen values of α so as to find the best one.
The drawback of this approach is that the gradient does not usually provide the best possible descent direction. Different components of the gradient could be of vastly different magnitudes (especially, when they correspond to different types of medium properties, e.g., Vp and ϵ) and may exhibit leakage from one component to another due to interdependence of different medium parameters on one another.
A better descent direction can be obtained if the quadratic terms are taken into account. Various approaches of this type are called Newton's method, Newton-CG, and Gauss-Newton and are based on inverting the Hessian:Δm=−(∇mmE)−1∇mE. Due to its size (typically 109×109 in 3D), the Hessian has to be inverted iteratively, each iteration involving application of the Hessian to a vector. Depending on the problem, the Hessian-vector products (an equivalent term for application of the Hessian to a vector), can be computed analytically, numerically using finite differences, or using the adjoint state method (Heinkenschloss, 2008). Since only a few (usually 10-20) iterations of this iterative process can be afforded in practice, the resulting approximations to the inverse Hessian are usually not very accurate and may not be able to eliminate the leakage (cross-talk) between various medium parameters or provide the correct scaling between different components of the gradient. Moreover, the inversion algorithm may lead to accumulation of artifacts Δm, resulting in a suboptimal solution.
A cheaper way to ensure proper relative scaling of the gradient components is to apply the subspace method (Kennett et al., 1988.) The key idea behind this method is to represent the model perturbation as a sum of basis vectors:Δm=αs1+βs2+ . . .For example, for two different types of medium parameters (e.g., Vp and ϵ) a customary choice (Sambridge et al., 1991) is:
      Δ    ⁢                  ⁢          m      ~        =            α      ⁡              [                                                            Δ                ⁢                                                                  ⁢                                  m                  1                                                                                        0                                      ]              +          β      ⁢                          [                                    0                                                              Δ              ⁢                                                          ⁢                              m                2                                                        ]      where one typically sets Δm1˜(−∇m1E), Δm2˜(−∇m2E). Δ{tilde over (m)} denotes the updated (improved) model perturbation, as opposed to the original model perturbation
      E    ⁡          (              m        +                  Δ          ⁢                                          ⁢                      m            ~                              )        ≈            E      ⁡              (        m        )              +                  α        ⁡                  (                                    ∇                              m                1                                      ⁢            E                    )                    ⁢      Δ      ⁢                          ⁢              m        1              +                  β        ⁡                  (                                    ∇                              m                2                                      ⁢            E                    )                    ⁢      Δ      ⁢                          ⁢              m        2              +                                        1            2                    ⁡                      [                                                                                αΔ                    ⁢                                                                                  ⁢                                          m                      1                                                                                                            βΔ                    ⁢                                                                                  ⁢                                          m                      2                                                                                            ]                          ⁡                  [                                                                                          ∇                                                                  m                        1                                            ⁢                                              m                        1                                                                              ⁢                  E                                                                                                  ∇                                                                  m                        1                                            ⁢                                              m                        2                                                                              ⁢                  E                                                                                                                          ∇                                                                  m                        2                                            ⁢                                              m                        1                                                                              ⁢                  E                                                                                                  ∇                                                                  m                        2                                            ⁢                                              m                        2                                                                              ⁢                  E                                                              ]                    ⁡              [                                                            αΔ                ⁢                                                                  ⁢                                  m                  1                                                                                                        βΔ                ⁢                                                                  ⁢                                  m                  2                                                                    ]            Thus, each component of the gradient can be scaled independently so that the resulting search direction is improved. The scaling factors α and β are chosen so that the quadratic approximation to the objective function is minimized:
      Δ    ⁢                  ⁢    m    =            [                                                  Δ              ⁢                                                          ⁢                              m                1                                                                                        Δ              ⁢                                                          ⁢                              m                2                                                        ]        .  It is easy to show that the minimum of the objective function will be obtained if we set
      [                            α                                      β                      ]    =      -                                        [                                                                                Δ                    ⁢                                                                                  ⁢                                                                  m                        1                        T                                            ⁡                                              (                                                                              ∇                                                                                          m                                1                                                            ⁢                                                              m                                1                                                                                                              ⁢                          E                                                )                                                              ⁢                    Δ                    ⁢                                                                                  ⁢                                          m                      1                                                                                                            Δ                    ⁢                                                                                  ⁢                                                                  m                        1                        T                                            ⁡                                              (                                                                              ∇                                                                                          m                                1                                                            ⁢                                                              m                                2                                                                                                              ⁢                          E                                                )                                                              ⁢                    Δ                    ⁢                                                                                  ⁢                                          m                      2                                                                                                                                        Δ                    ⁢                                                                                  ⁢                                                                  m                        2                        T                                            ⁡                                              (                                                                              ∇                                                                                          m                                2                                                            ⁢                                                              m                                1                                                                                                              ⁢                          E                                                )                                                              ⁢                    Δ                    ⁢                                                                                  ⁢                                          m                      1                                                                                                            Δ                    ⁢                                                                                  ⁢                                                                  m                        2                        T                                            ⁡                                              (                                                                              ∇                                                                                          m                                2                                                            ⁢                                                              m                                2                                                                                                              ⁢                          E                                                )                                                              ⁢                    Δ                    ⁢                                                                                  ⁢                                          m                      2                                                                                            ]                                -            1                          ⁡                  [                                                                                          (                                                                  ∇                                                  m                          1                                                                    ⁢                      E                                        )                                    ⁢                  Δ                  ⁢                                                                          ⁢                                      m                    1                                                                                                                                            (                                                                  ∇                                                  m                          2                                                                    ⁢                      E                                        )                                    ⁢                  Δ                  ⁢                                                                          ⁢                                      m                    2                                                                                ]                    .      
The cost of determining the values of α and β (which provide the desired scaling of the gradient components) is equal to two applications of the Hessian to a vector (Δm1 and Δm2), making this method far cheaper than Newton/Newton-CG/Gauss-Newton.
However, the limitation is that the leakage (cross-talk) cannot be handled effectively, since all the subspace method does is scale each component of the gradient up or down (by α and β).