Full wavefield inversion (FWI) is a nonlinear inversion technique that recovers the earth model by minimizing the mismatch between the simulated and the observed seismic wavefields. Because of the high computational cost of FWI, conventional implementations use local optimization techniques to estimate optimal model parameters. A widely used local optimization technique is the gradient-based first-order method (e.g., steepest descent or nonlinear conjugate gradient), which uses only the gradient of the objective function to define a search direction. Although a gradient-only first-order method is relatively efficient, since it requires computing only the gradient of the objective function, its convergence is generally slow. The convergence of FWI can be improved significantly by using a second-order method, because second-order methods use both the gradient and the curvature of the objective function to determine an optimal search direction in model parameter space. (The search direction unit vector $\mathbf{s}$ is related to the model update by $\mathbf{m}_{\text{updated}} = \mathbf{m} + \alpha\mathbf{s}$, where the scalar $\alpha$ is the step size.)
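The difference in convergence can be illustrated with a minimal sketch. It assumes a toy two-parameter quadratic objective in place of a real seismic misfit, and an exact line search for the step size α. The matrix `H`, the vector `b`, and all function names are hypothetical and chosen only for illustration. Both methods apply the same update m + αs and differ only in how the unit search direction s is chosen.

```python
# Toy objective f(m) = 0.5 * m^T H m - b^T m (illustrative, not a seismic misfit).
# H is an assumed Hessian with poorly balanced curvature; the minimizer is m = [1, 1].
H = [[100.0, 0.0], [0.0, 1.0]]
b = [100.0, 1.0]

def gradient(m):
    """Gradient of the toy objective: H m - b."""
    return [H[0][0] * m[0] + H[0][1] * m[1] - b[0],
            H[1][0] * m[0] + H[1][1] * m[1] - b[1]]

def norm(v):
    return (v[0] ** 2 + v[1] ** 2) ** 0.5

def steepest_descent_direction(g):
    """First-order method: unit vector along the negative gradient."""
    n = norm(g)
    return [-g[0] / n, -g[1] / n]

def newton_direction(g):
    """Second-order method: unit vector along -H^{-1} g (2x2 solve)."""
    det = H[0][0] * H[1][1] - H[0][1] * H[1][0]
    p = [-(H[1][1] * g[0] - H[0][1] * g[1]) / det,
         -(-H[1][0] * g[0] + H[0][0] * g[1]) / det]
    n = norm(p)
    return [p[0] / n, p[1] / n]

def exact_step(g, s):
    """Step size alpha that minimizes f(m + alpha * s) for a quadratic f."""
    Hs = [H[0][0] * s[0] + H[0][1] * s[1],
          H[1][0] * s[0] + H[1][1] * s[1]]
    return -(g[0] * s[0] + g[1] * s[1]) / (s[0] * Hs[0] + s[1] * Hs[1])

def iterate(direction, m, tol=1e-8, max_iter=10000):
    """Repeat m_updated = m + alpha * s until the gradient norm drops below tol."""
    for k in range(max_iter):
        g = gradient(m)
        if norm(g) < tol:
            return m, k
        s = direction(g)
        alpha = exact_step(g, s)
        m = [m[0] + alpha * s[0], m[1] + alpha * s[1]]
    return m, max_iter

m_sd, n_sd = iterate(steepest_descent_direction, [0.0, 0.0])
m_nt, n_nt = iterate(newton_direction, [0.0, 0.0])
print("steepest descent iterations:", n_sd)
print("Newton iterations:", n_nt)
```

On an exactly quadratic objective the Newton direction points straight at the minimizer, so a single update suffices, whereas steepest descent zigzags across the narrow valley created by the unbalanced curvature. A real FWI objective is not quadratic, so the contrast is less extreme in practice.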
The major difference between first-order and second-order methods is that second-order methods precondition the gradient with the inverse Hessian (e.g., the Gauss-Newton or Newton method) or with the inverse of a projected Hessian (e.g., the subspace method). The Hessian is the matrix of second-order partial derivatives of the objective function with respect to the model parameters. In general, second-order methods are attractive not only because of their relatively fast convergence, but also because they can balance the gradients of different parameter classes and provide meaningful updates for parameter classes with different data sensitivities (e.g., velocity, anisotropy, attenuation) in multi-parameter inversion. Optimal scaling of the parameter classes by the Hessian is crucial if such classes are to be inverted simultaneously. However, computing the inverse of the Hessian is very expensive, which is a major obstacle to wide adoption of second-order methods in practice. Another disadvantage is that if the objective function is not quadratic or convex (e.g., where the initial model is far from the true model), the Hessian or its approximation may not accurately predict the shape of the objective function. The gradients for different parameter classes may then be improperly scaled, resulting in suboptimal search directions.
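The balancing effect of the Hessian can be sketched on a small linear problem. The example assumes two scalar parameters standing in for two parameter classes with very different data sensitivities, a made-up sensitivity matrix `J`, and a least-squares objective, so that the Gauss-Newton Hessian is JᵀJ. None of these values come from a real survey, and the helper names are hypothetical.

```python
# Assumed toy setup: column 0 of J is a high-sensitivity parameter class
# (e.g., velocity) and column 1 a low-sensitivity class (e.g., anisotropy).
J = [[10.0, 0.1],
     [8.0, 0.3],
     [12.0, 0.2]]
m_true = [1.0, 1.0]   # both parameters need an update of the same size
m0 = [0.0, 0.0]       # starting model

def forward(m):
    """Linear forward model d = J m."""
    return [row[0] * m[0] + row[1] * m[1] for row in J]

d_obs = forward(m_true)

def gradient(m):
    """Gradient of 0.5 * ||J m - d_obs||^2, i.e. J^T (J m - d_obs)."""
    r = [p - o for p, o in zip(forward(m), d_obs)]
    return [sum(J[i][k] * r[i] for i in range(len(J))) for k in range(2)]

def gauss_newton_hessian():
    """Gauss-Newton Hessian J^T J."""
    return [[sum(J[i][a] * J[i][c] for i in range(len(J))) for c in range(2)]
            for a in range(2)]

def solve2(A, rhs):
    """Solve a 2x2 linear system A x = rhs by Cramer's rule."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [(A[1][1] * rhs[0] - A[0][1] * rhs[1]) / det,
            (-A[1][0] * rhs[0] + A[0][0] * rhs[1]) / det]

g = gradient(m0)
H = gauss_newton_hessian()
dm_gn = solve2(H, [-g[0], -g[1]])   # gradient preconditioned by the inverse Hessian

# Relative size of the low-sensitivity update compared with the high-sensitivity one.
ratio_gradient = abs(g[1] / g[0])
ratio_gn = abs(dm_gn[1] / dm_gn[0])
print("gradient-only ratio:", ratio_gradient)
print("Gauss-Newton ratio:", ratio_gn)
```

In this sketch the raw gradient is dominated by the high-sensitivity parameter, so a gradient-only update barely moves the second parameter even though both are equally far from their true values. Preconditioning with the inverse Hessian restores comparable updates for both. The 2x2 solve is trivial here; for realistic model sizes forming and inverting the Hessian is the expensive step described above.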