The search for subsurface hydrocarbon deposits typically involves a sequence of data acquisition, analysis, and interpretation procedures. The data acquisition phase involves use of an energy source to generate signals that propagate into the earth and reflect from various subsurface geologic structures. The reflected signals are recorded by a multitude of receivers on or near the surface of the earth, or in an overlying body of water. The received signals, which are often referred to as seismic traces, consist of amplitudes of acoustic energy which vary as a function of time, receiver position, and source position and, most importantly, vary as a function of the physical properties of the structures from which the signals reflect. The data analyst uses these traces along with a geophysical model to develop an image of the subsurface geologic structures.
The analysis phase involves procedures that vary depending on the nature of the geological structure being investigated, and on the characteristics of the dataset itself. In general, however, the purpose of a typical seismic data processing effort is to produce an image of the geologic structure from the recorded data. That image is developed using theoretical and empirical models of the manner in which the signals are transmitted into the earth, attenuated by the subsurface strata, and reflected from the geologic structures. The quality of the final product of the data processing sequence is heavily dependent on the accuracy of these analysis procedures.
The final phase is the interpretation of the analytic results. Specifically, the interpreter's task is to assess the extent to which subsurface hydrocarbon deposits are present, thereby aiding such decisions as whether additional exploratory drilling is warranted or what an optimum hydrocarbon recovery scenario may be. In that assessment, the interpretation of the image involves a variety of different efforts. For example, the interpreter often studies the imaged results to obtain an understanding of the regional subsurface geology. This may involve marking main structural features, such as faults, synclines and anticlines. Thereafter, a preliminary contouring of horizons may be performed. A subsequent step of continuously tracking horizons across the various vertical sections, with correlations of the interpreted faults, may also occur. As is clearly understood in the art, the quality and accuracy of the results of the data analysis step of the seismic sequence have a significant impact on the accuracy and usefulness of the results of this interpretation phase.
In principle, the seismic image can be developed using a three-dimensional geophysical model of seismic wave propagation, thereby facilitating accurate depth and azimuthal scaling of all reflections in the data. Accurately specified reflections greatly simplify data interpretation, since the interpretational focus can be on the nature of the geologic structure involved and not on the accuracy of the image. Unfortunately, three dimensional geophysical models frequently require intolerably long computation times, and seismic analysts are forced to simplify the data processing effort as much as possible to reduce the burdens of both analysis time and cost.
In addition to the 3-D computation challenge, the analyst faces a processing volume challenge. For example, a typical data acquisition exercise may involve hundreds to hundreds of thousands of source locations, with each source location having hundreds of receiver locations. Because each source-receiver pair may make a valuable contribution to the desired output image, the data handling load (i.e., the input/output data transfer demand) can be a burden in itself, independent of the computation burden.
Seismic data analysts have historically used several different approaches to manage these burdens, directly or indirectly. These approaches relate principally to either the manner in which the data acquisition exercise is designed and carried out, or to the assumptions made during the data analysis effort. In both cases, the quality of the output of the data interpretation procedure may be directly affected. These approaches are most easily discussed in conjunction with FIG. 1, which depicts a perspective view of a region 20 of the earth for which a geophysical image is desired. On the surface 18 of the earth are shown a number of shot lines 2 along which the seismic data are acquired. As shown in FIG. 1A, shot lines 2 consist of a sequence of positions at which a seismic source 3 is placed and from which seismic signals 5 are transmitted into the earth. Receivers 4 placed along each line receive the signals from each source position after reflection from various subsurface reflectors 6.
A first method of managing the seismic data burdens discussed above involves careful definition of the region over which the data are acquired. Specifically, use of any available preliminary geologic and geophysical information may facilitate the minimization of the surface area over which seismic data may need to be acquired. Such a minimization will directly reduce the amount of data that is ultimately acquired. Furthermore, similarly careful planning of the spacing between shot lines will optimize the analysis effort by reducing data volume. And finally, optimization of the number of sources and receivers that are used, and of the spacing between adjacent source and receiver positions, will also benefit the data analyst.
None of these efforts can be accomplished without a penalty. For example, relatively wide spacing between shot lines, or between sources and receivers, reduces the resolution of the computed seismic image, thus making interpretation more difficult. In addition, complex geologic features may not be resolvable without relatively close spacing. And finally, certain data acquisition exercises, such as those in relatively unexplored areas, do not allow optimization of the surface area over which data are to be acquired. As a result, the data handling burden cannot be entirely eliminated through data acquisition planning.
Methods of minimizing the computational burden are often implemented during data analysis. One commonly invoked technique involves use of a two-dimensional geophysical model. For example, in FIG. 1A, the signals for each source are depicted as traveling in the plane directly beneath the shotline on which the source lies. Thus, the signal is assumed to propagate independent of out-of-plane geologic structures. This simplifying assumption allows use of two dimensional geophysical models in the image generation process, and, as is well known, two dimensional analysis procedures can be much more computationally efficient than three dimensional analysis procedures.
Limitations to the 2-D analysis assumption exist. Geologic structures are rarely, if ever, two dimensional; that assumption may therefore lead to inaccurately specified images. Because little is generally known of the geologic structure being investigated, the analyst usually does not know the extent to which that image is in error. In addition, because each plane is analyzed independently, the interpreter must tie the images for each plane to each of the others by interpolation or other similar interpretative methods if a continuous image across the entire cubic region is desired. Finally, some complex structures, such as faulted regions and salt features, cannot be accurately analyzed merely by use of two dimensional methods.
Because of these and other limits that have long constrained seismic data analysts, the petroleum industry has typically been an early user of newly developed high speed computer hardware. As each new generation of equipment has become available, analysis routines that implement fully three dimensional analysis capabilities have become more commonly used. Nevertheless, it is not uncommon for significant computer times to be involved in complex analyses, often involving weeks or months of actual processing time.
The recent availability of massively parallel processors offers a significant opportunity to seismic data analysts. Massively parallel processors (MPPs) have multiple central processing units (CPUs) which can perform simultaneous computations. By efficient use of these CPUs, the weeks or months previously required for complex analyses can be reduced to a few days, or perhaps a few hours. However, this significant advantage can only be realized if efficient computational algorithms are encoded in the MPP software. Thus, the opportunity MPPs offer seismic data analysts also creates a challenge for the development of suitable computational algorithms that take advantage of the multiple CPUs.
This challenge can be easily discussed by considering the manner in which computational algorithms have most commonly been written for existing seismic analysis routines. Until recently, computers relied on a mode of operation referred to as sequential computing. Sequential computing involves use of analytic routines that perform only a single procedure, or perhaps focus on a single subset of the data or image, at any given time. This is a direct result of a computer having only one CPU. For that reason, the only optimization procedures that can be employed on single CPU computers are those which increase the efficiency of the processing as to the procedure or subset. Because all calculations must ultimately be performed by that single CPU, however, the options for obtaining high performance are innately limited.
On the other hand, the multiple CPU capability of MPPs offers an obvious simultaneous computation advantage. This advantage is that the total time required to solve a computational problem can be reduced by subdividing the work to be done among the various CPUs, provided that the subdivision allows each CPU to perform useful work while the other CPUs are also performing work. Unfortunately, the disadvantage of multiple CPU hardware is that the sequential processing methods that have long been used in software development must be replaced by more appropriate parallelized computing methods. Simply stated, MPPs require that processing methods be developed which make efficient use of the multiple CPU hardware. Ideally, these methods should organize the distribution of work relatively evenly among the processors, and ensure that all processors are performing necessary computations all of the time, rather than awaiting intermediate results from other processors.
The challenge of defining parallelized processing methods, and of optimizing those parallelized methods once defined, is particularly acute in the seismic data processing arena. Seismic data consist generally of a large number of individual traces, each recorded somewhat independently of the other traces. Logically enough, sequential computing methods that require the analytic focus to be placed on a single calculation at a time adapt well to analysis of these independent traces. This is true even though computational bottlenecks may exist. For example, portions of the analytic sequence may require relatively more computation time than other portions, may need to be completed before other calculations can proceed, or may rely on input data, for example traveltimes, similar to those required for other traces. Since no simultaneous computations occur in sequential processing, none of these bottlenecks reduces computational efficiency on a single CPU; their only effect is on the total processing time that is required, and they do not otherwise pose problems for the analyst. To take full advantage of MPP computing capabilities, however, where the goal is to perform simultaneous processing in all CPUs, methods for optimizing the seismic analysis phase by eliminating such bottlenecks must be developed.
This advantage of an MPP becomes clear by considering the limitation which calculation time places on image region size in single CPU computers. Increasing the size of the image, e.g., by expanding the size of cube 20 in FIG. 1, or increasing the amount of data to be processed, e.g., by adding sources 3 and receivers 4 to shot lines 2, increases the total computation. That direct impact on calculation time places a heavy burden on seismic analysts to optimize image size, especially since even small image regions may require weeks of computation time on even the highest speed sequential processing computers. In contrast, efficient processing on MPPs, which may have 256 or more individual CPUs, should only involve minimally lengthened computation times, since each CPU would assume just a fraction, for example 1/256, of the additional work required by the larger region. This potential for scalability of the image region and the work load required in image generation is a principal benefit of MPPs, a benefit that can only be realized if parallelized seismic processing methods allowing such workload scalability are developed.
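By way of illustration only, the scaling argument above can be sketched in Python under an idealized model in which the work divides evenly among the CPUs and communication costs are ignored. The function name `ideal_time` and the work-unit figures are hypothetical and are not taken from any actual survey.

```python
def ideal_time(work_units: float, num_cpus: int) -> float:
    """Elapsed time in an idealized model: even split, no communication cost."""
    if num_cpus < 1:
        raise ValueError("num_cpus must be at least 1")
    return work_units / num_cpus


# Hypothetical figures: a base image region and the extra work of a larger one.
base_work = 1_000_000.0
extra_work = 256_000.0

# Added elapsed time when the larger region is processed on one CPU.
added_on_one_cpu = ideal_time(base_work + extra_work, 1) - ideal_time(base_work, 1)

# Added elapsed time on 256 CPUs: each CPU absorbs 1/256 of the extra work.
added_on_256_cpus = ideal_time(base_work + extra_work, 256) - ideal_time(base_work, 256)

print(added_on_one_cpu, added_on_256_cpus)
```

In practice the even split and the absence of waiting between processors are exactly the conditions that a parallelized method must be designed to approach.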
Basic considerations for determining efficient parallelized seismic processing methods become evident by reconsidering the above review of the seismic analysis process. As noted, the purpose of seismic analysis is to analyze measured seismic data using geophysical models to develop images of the subsurface. Therefore, each of three principal processing components--data, model, and image--may be considered to be a candidate for distributing computational work among the various processors in an MPP. One option for distributing work among the processors would be to assign different groups of the input seismic trace data to different processors. For example, traces may be grouped by source locations, with different processors being assigned different groups. Similarly, the output image could be subdivided and assigned to different processors. Finally, it may also be possible to subdivide the geophysical model used to generate the output image into groupings that can be assigned to the various processors. (That model is generally considered to be embodied in the arithmetic operations required by the mathematical model that is the subject of the processing effort. For example, in seismic analysis the mathematical model is often based on the wave equation). For example, the data may be transformed into the frequency domain, with individual frequencies assigned to individual processors. It may also be possible to develop combinations of these approaches. For example, groups of processors may be assigned collective responsibility for specific frequencies in the model and all depths in the image, while having individual responsibility for specific horizontal locations in the image. The challenge to the seismic data analyst is to determine methods of subdividing the seismic data, model, and image into components that can be assigned to individual processors in the MPP, thus allowing calculations to be performed in each processor independently of other processors. 
This subdivision of seismic data analysis into individual components is commonly referred to as seismic decomposition.
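As a minimal sketch of such a decomposition, the following Python fragment deals a list of work items out to processors in round-robin fashion. The same helper is applied once to frequency indices (a decomposition of the model) and once to shot identifiers (a decomposition of the data). The helper name `assign_round_robin`, the item labels, and the processor count are assumptions chosen for illustration; round-robin is only one of many possible assignment rules.

```python
def assign_round_robin(items, num_procs):
    """Deal items out to num_procs processors so bucket sizes differ by at most one."""
    if num_procs < 1:
        raise ValueError("num_procs must be at least 1")
    buckets = [[] for _ in range(num_procs)]
    for index, item in enumerate(items):
        buckets[index % num_procs].append(item)
    return buckets


# Model decomposition: ten hypothetical frequency indices over four processors.
by_frequency = assign_round_robin(list(range(10)), 4)

# Data decomposition: six hypothetical shot gathers over the same four processors.
by_shot = assign_round_robin([f"shot{n}" for n in range(6)], 4)

for proc, (freqs, shots) in enumerate(zip(by_frequency, by_shot)):
    print(proc, freqs, shots)
```

Each processor can then work through its own bucket without reference to the others, provided the calculation for one item does not depend on intermediate results for another.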
One type of MPP has from thousands to tens of thousands of relatively unsophisticated processing elements. In this kind of machine, referred to as a Single Instruction, Multiple Data stream (SIMD) machine, the processing elements typically perform the same operation on multiple data streams. An example is the CM2, a product of the Thinking Machines Corporation. These kinds of machines typically lack shared memory, i.e., each processor has its own separate memory unit and the information in the memory cannot be directly accessed by other processors. The individual processors typically have limited computing capability and memory. Because of the large number of processing elements and a lack of shared memory, data transfer between the processing elements is a major bottleneck in efficient utilization of the capability of the machines. Even with sophisticated interconnection techniques, such as in a hypercube arrangement, transfer of data between processors is a major factor in the running time of programs.
Other computers have much more powerful elements in arrays of tens or hundreds. The T3D, a product of Cray Research Corporation, is an example of this kind of machine. Besides having individual processing elements that are much more powerful than those in the CM2, the T3D has fewer of the elements and a physically distributed, logically shared memory. This Multiple Instruction Multiple Data stream (MIMD) machine has different elements performing different operations on different parts of the data at the same time. The reduced number of processing elements means that data does not have to be transferred to as many elements as in a SIMD machine. Because of the increased sophistication and cost of the individual elements and because of their fewer numbers, efficient utilization requires that the load on the processing elements be balanced. An additional factor is that each processing element must accommodate a larger subset of the overall data volume; computations that involve sorting of the data could become more complicated.
U.S. Pat. No. 5,404,296 issued to Moorhead discloses and claims a method for migration on an MPP in which there are a large number of processing elements arranged in a preselected, regular pattern. The data are initially shot-sorted and partitioned into blocks of shots so that the product of the number of receiver positions in the data set and the number of shots in a block equals the total number of processing elements available. Because 3-D seismic data volumes typically contain hundreds of shot and receiver positions, the disclosure and claims are limited to SIMD machines.
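The block-sizing constraint described above reduces to simple arithmetic, sketched here in Python. The sketch covers only the stated relation (receiver positions multiplied by shots per block equals the number of processing elements); the function names and the machine and survey sizes are hypothetical and do not come from the cited patent.

```python
def shots_per_block(num_processing_elements: int, num_receiver_positions: int) -> int:
    """Shots per block such that receivers * shots equals the processor count."""
    if num_processing_elements < 1 or num_receiver_positions < 1:
        raise ValueError("counts must be positive")
    if num_processing_elements % num_receiver_positions != 0:
        raise ValueError("processor count is not a multiple of the receiver count")
    return num_processing_elements // num_receiver_positions


def num_blocks(total_shots: int, block_size: int) -> int:
    """Number of blocks needed to cover all shots (ceiling division)."""
    return -(-total_shots // block_size)


# Hypothetical sizes: 8192 processing elements, 128 receiver positions, 1000 shots.
block_size = shots_per_block(8192, 128)
blocks = num_blocks(1000, block_size)
print(block_size, blocks)
```

With these assumed sizes, each block holds 64 shots and 16 blocks are needed, the last of which is only partly filled.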
One approach that reduces the amount of the 3-D processing is to image a partially processed data set. This is particularly useful when the partial processing operation is computationally simpler than a complete imaging process. For example, prior to migration, the data could be stacked using a conventional Normal Moveout (NMO) velocity analysis. As will be familiar to those knowledgeable in the art, algorithms that perform migration of zero-offset data are computationally less intensive than those that migrate unstacked data. The stacking operation generally improves the signal to noise ratio of the data compared to that of the individual traces. To the extent that the stacked data accurately represents a hypothetical zero-offset trace, the computational burden is reduced.
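For illustration, a conventional NMO correction and stack can be sketched in Python using the standard hyperbolic moveout relation t(x) = sqrt(t0^2 + x^2/v^2), where t0 is the zero-offset two-way time, x is the source-receiver offset, and v is the NMO velocity. The sketch assumes a single constant velocity and nearest-sample lookup rather than interpolation; the function names and the synthetic gather are assumptions made for the example.

```python
import math


def nmo_time(t0: float, offset: float, velocity: float) -> float:
    """Two-way reflection time at a given offset under hyperbolic moveout."""
    return math.sqrt(t0 * t0 + (offset / velocity) ** 2)


def nmo_stack(traces, offsets, velocity, dt):
    """NMO-correct each trace by nearest-sample lookup, then average the traces."""
    num_samples = len(traces[0])
    stacked = []
    for i in range(num_samples):
        total = 0.0
        for trace, offset in zip(traces, offsets):
            j = round(nmo_time(i * dt, offset, velocity) / dt)
            if j < num_samples:  # samples that map past the trace end add nothing
                total += trace[j]
        stacked.append(total / len(traces))
    return stacked


# Synthetic gather: one reflector at zero-offset sample i0, seen at four offsets.
dt, velocity, num_samples, i0 = 0.004, 2000.0, 500, 250
offsets = [0.0, 500.0, 1000.0, 1500.0]
traces = []
for offset in offsets:
    trace = [0.0] * num_samples
    trace[round(nmo_time(i0 * dt, offset, velocity) / dt)] = 1.0
    traces.append(trace)

stacked = nmo_stack(traces, offsets, velocity, dt)
print(stacked.index(max(stacked)))
```

The four input traces collapse to a single stacked trace whose peak sits at the zero-offset sample, which is the reduction in data volume that makes the subsequent migration cheaper.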
In U.S. Pat. No. 5,349,527, Pieprzak and Highnam disclose a method for migration of such a data set on a SIMD MPP. The method starts with a 3-D data volume consisting of a time series (vertical axis) of reflection seismic data on a regular rectangular grid (horizontal axes) on the surface of the earth [(t, x, y) data domain]. A Fourier transformation is performed on the reflection seismic data to give a 3-D volume consisting of amplitudes as a function of frequency (vertical axis) on the rectangular surface grid [(ω, x, y) data domain, where ω is the temporal frequency]. The data are partitioned into frequency sub-bands, and within each frequency sub-band, at selected frequencies, data are partitioned into subsets of the rectangular surface grid and assigned to individual processors. This is necessitated by the limited memory capability of the individual processors. The limitations of the individual processors also necessitate a hypercube arrangement of the processors in order to reduce the time involved in data transfer between the processors.
In Pieprzak and Highnam's invention, the subsequent migration is accomplished by an iterative, two-step process. The first step is the downward continuation of a single frequency component over one depth interval. The contributions of all the frequencies are then summed to produce a migrated image at the bottom of the depth interval. The two steps are repeated for successive depth intervals until the maximum depth is reached.
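The structure of this two-step loop can be sketched in Python for a single zero-offset trace. The sketch is not the cited implementation: in place of the McClellan-transform filter it substitutes a constant-velocity, vertical-propagation phase shift, and it assumes the exploding-reflector convention in which a reflector at depth z appears at two-way time 2z/v. The function names and the synthetic trace are hypothetical.

```python
import cmath
import math


def dft(trace):
    """Plain discrete Fourier transform of a short real trace."""
    n = len(trace)
    return [sum(trace[j] * cmath.exp(-2j * math.pi * k * j / n) for j in range(n))
            for k in range(n)]


def migrate_trace(trace, dt, velocity, dz, num_depths):
    """Return image values at depths dz, 2*dz, ..., num_depths*dz."""
    n = len(trace)
    spectrum = dft(trace)
    # Signed angular frequency of each DFT bin.
    omegas = [2.0 * math.pi * (k if k <= n // 2 else k - n) / (n * dt)
              for k in range(n)]
    step_delay = 2.0 * dz / velocity  # two-way time across one depth interval
    image = []
    for _ in range(num_depths):
        # Step 1: continue every frequency component down one depth interval.
        spectrum = [s * cmath.exp(1j * w * step_delay)
                    for s, w in zip(spectrum, omegas)]
        # Step 2: sum over frequencies to image the bottom of the interval.
        image.append(sum(spectrum).real / n)
    return image


# Synthetic trace: a reflection at sample 5 (20 ms), i.e. depth 20 m at 2000 m/s.
trace = [0.0] * 16
trace[5] = 1.0
image = migrate_trace(trace, dt=0.004, velocity=2000.0, dz=4.0, num_depths=8)
peak = max(range(len(image)), key=lambda i: image[i])
print((peak + 1) * 4.0)
```

Because the first step treats each frequency independently, it is the natural place to divide work among processors, whereas the summation in the second step requires the contributions of all frequencies to be brought together.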
The downward continuation is performed using the McClellan transform method given in Hale, D., 3-D Depth Migration via McClellan Transformations, 56 Geophysics 1778 (1991). The actual implementation is done by means of a convolution filter. As will be known to those familiar with the art, the Hale method has difficulty handling steep dips and the computational burden becomes very heavy at dips greater than 60°. To reduce the computational burden, Pieprzak and Highnam teach the use of a recursive Chebyshev filter for the convolution step. The coefficients of the filter are precomputed from the velocity model of the subsurface and stored redundantly in the local memory of the processing elements. Such an approach is appropriate in SIMD computers where the processing elements need to keep track of only a small subgrid of (x, y) points and do not have the capability of computing the filter coefficients "on the fly."
Those knowledgeable in the art recognize that for large problems, frequency domain methods can offer significant computation cost savings. Furthermore, the prior art two-pass migration methods are computationally fast because they perform the migration in the x- and y-direction separately. Pieprzak and Highnam use a one-pass frequency-space domain migration process that is computationally more accurate than the two-pass migration process used in prior art. However, the increased accuracy of the one-pass method comes at the cost of vastly increased computational load. One option for overcoming that increased computational load is to employ frequency-wavenumber migration, which can be significantly faster than frequency-space migration. However, a frequency-wavenumber scheme based on Pieprzak and Highnam's disclosure would be impractical for the reason that the transformation from the spatial domain to the wavenumber domain becomes expensive when each processor has access to only a small portion of the x-y grid.
U.S. Statutory Invention Registration H 482 issued to Berryhill, Gonzalez and Kim in 1988 discloses a method for recursive time migration in the frequency-wavenumber domain. That method assumes that the subsurface of the earth can be modeled by layers in which the velocity is constant. However, the method is inefficient to implement on an MPP because it requires successive Fourier transformations in different directions on a data volume entirely within processor memory, and the geophysical data sets of interest to industry cannot be stored entirely within processor memory.
It would therefore be desirable to have a method that is able to perform a one-pass migration in the frequency-wavenumber domain on an MPP in a cost-effective manner. The present invention satisfies this need.