The search for subsurface hydrocarbon deposits typically involves a multifaceted sequence of data acquisition, analysis, and interpretation procedures. The data acquisition phase involves use of an energy source to generate signals which propagate into the earth and reflect from various subsurface geologic structures. The reflected signals are recorded by a multitude of receivers on or near the surface of the earth, or in an overlying body of water. The received signals, which are often referred to as seismic traces, consist of amplitudes of acoustic energy which vary as a function of time, receiver position, and source position and, most importantly, vary as a function of the physical properties of the structures from which the signals reflect. The data analyst uses these traces along with a geophysical model to develop an image of the subsurface geologic structures.
The analysis phase involves procedures which vary depending on the nature of the geological structure being investigated, and on the characteristics of the dataset itself. In general, however, the purpose of a typical seismic data processing effort is to produce an image of the geologic structure from the recorded data. That image is developed using theoretical and empirical models of the manner in which the signals are transmitted into the earth, attenuated by the subsurface strata, and reflected from the geologic structures. The quality of the final product of the data processing sequence is heavily dependent on the accuracy of these analysis procedures.
The final phase is the interpretation of the analytic results. Specifically, the interpreter's task is to assess the extent to which subsurface hydrocarbon deposits are present, thereby aiding such decisions as whether additional exploratory drilling is warranted or what an optimum hydrocarbon recovery scenario may be. In that assessment, the interpretation of the image involves a variety of different efforts. For example, the interpreter often studies the imaged results to obtain an understanding of the regional subsurface geology. This may involve marking main structural features, such as faults, synclines and anticlines. Thereafter, a preliminary contouring of horizons may be performed. A subsequent step of continuously tracking horizons across the various vertical sections, with correlations of the interpreted faults, may also occur. As is clearly understood in the art, the quality and accuracy of the results of the data analysis step of the seismic sequence has a significant impact on the accuracy and usefulness of the results of this interpretation phase.
In principle, the seismic image can be developed using a three-dimensional geophysical model of seismic wave propagation, thereby facilitating accurate depth and azimuthal scaling of all reflections in the data. Accurately specified reflections greatly simplify data interpretation, since the interpretational focus can be on the nature of the geologic structure involved and not on the accuracy of the image. Unfortunately, three dimensional geophysical models frequently require intolerably long computation times, and seismic analysts are forced to simplify the data processing effort as much as possible to reduce the burdens of both analysis time and cost.
In addition to the three dimensional computation challenge, the analyst faces a processing volume challenge. For example, a typical data acquisition exercise may involve hundreds to hundreds of thousands of source locations, with each source location having hundreds of receiver locations. Because each source-receiver pair may make a valuable contribution to the desired output image, the data handling load (i.e. the input/output data transfer demand) can be a burden in itself, independent of the computation burden.
Seismic data analysts have historically used several different approaches to directly or indirectly manage these burdens. These approaches relate principally to either the manner in which the data acquisition exercise is designed and carried out, or to the assumptions made during the data analysis effort. In both cases, the quality of the output of the data interpretation procedure may be directly affected. These approaches are most easily discussed in conjunction with FIG. 1, which depicts a perspective view of a region 20 of the earth for which a geophysical image is desired. On surface 18 of the earth are shown a number of shot lines 2 along which the seismic data are acquired. As shown in FIG. 1A, shot lines 2 consist of a sequence of positions at which a seismic source 3 is placed and from which seismic signals 5 are transmitted into the earth. Receivers 4 placed along each line receive the signals from each source position after reflection from various subsurface reflectors 6.
In FIG. 1, region 20 is a three-dimensional volume of the earth. It is not necessary that region 20 have any particular shape (e.g., a cube or a parallelepiped). However, for ease of reference, all such three-dimensional volumes will be hereinafter referred to as "cubic."
A first method of managing the seismic data burdens discussed above involves careful definition of the region over which the data acquisition exercise is carried out. Specifically, use of any available preliminary geologic and geophysical information may facilitate the minimization of the surface area over which seismic data may need to be acquired. Such a minimization will directly reduce the amount of data that is ultimately acquired. Furthermore, similarly careful planning of the spacing between shot lines will optimize the analysis effort by reducing data volume. And finally, optimization of the number of sources and receivers which are used, and of the spacing between adjacent source and receiver positions, will also benefit the data analyst.
None of these efforts can be accomplished without a penalty however. For example, relatively wide spacings between shot lines, or between sources and receivers, reduce the resolution of the computed seismic image, thus making interpretation more difficult. In addition, complex geologic features may not be resolvable without relatively close spacings. And finally, certain data acquisition exercises, such as in relatively unexplored areas, do not allow optimization of the surface area over which data is to be acquired. As a result, the data handling burden cannot be entirely eliminated through data acquisition planning.
Methods of minimizing the computational burden are often implemented during data analysis. One commonly invoked technique involves use of a two-dimensional geophysical model. For example, in FIG. 1A, the signals for each source is depicted as traveling in the plane directly beneath the shotline on which the source lies. Thus, the signal is assumed to propagate independent of out-of-plane geologic structures. This simplifying assumption allows use of two dimensional geophysical models in the image generation process, and, as is well known, two dimensional analysis procedures can be much more computationally efficient than three dimensional analysis procedures.
Limitations to the two dimensional analysis assumption exist however. Geologic structures are rarely, if ever, two dimensional; that assumption may therefore lead to inaccurately specified images. Because little is generally known of the geologic structure being investigated, the analyst usually does not know the extent to which that image is in error. In addition, because each plane is analyzed independently, the interpreter must tie the images for each plane to each other by interpolation or other similar interpretative methods if a continuous image across the entire cubic region is desired. And finally, some complex structures, such as faulted regions and salt features, cannot be accurately analyzed merely by use of two dimensional methods.
Because of these and other limits which have long constrained seismic data analysts, the petroleum industry has typically been an early user of newly developed high speed computer hardware. As each new generation of equipment has become available, analysis routines which implement fully three dimensional analysis capabilities have become more commonly used. Nevertheless, it is not uncommon for significant computer times to be involved in complex analyses, often involving weeks or months of actual processing time.
The recent availability of massively parallel processors offers a significant opportunity to seismic data analysts. Massively parallel processors (MPPs) have multiple central processing units (CPUs) which can perform simultaneous computations. By efficient use of these CPUs, the weeks or months previously required for complex analyses can be reduced to a few days, or perhaps a few hours. However, this significant advantage can only be realized if efficient computational algorithms are encoded in the MPP software. Thus, the opportunity MPPs offer seismic data analysts also creates a challenge for the development of suitable computational algorithms which take advantage of the multiple CPUs.
This challenge can be easily discussed by considering the manner in which computational algorithms have most commonly been written for existing seismic analysis routines. Until recently, computers relied on a mode of operation referred to as sequential computing. Sequential computing involves use of analytic routines which perform only a single procedure, or perhaps focus on a single subset of the data or image, at any given time. This is a direct result of a computer having only one CPU. For that reason, the only optimization procedures which can be employed on single CPU computers are those which increase the efficiency of the processing as to the procedure or subset. Because all calculations must ultimately be performed by that single CPU, however, the options for obtaining high performance are innately limited.
On the other hand, the multiple CPU capability of MPPs offers an obvious simultaneous computation advantage. This advantage is that the total time required to solve a computational problem can be reduced by subdividing the work to be done among the various CPUs, provided that the subdivision allows each CPU to perform useful work while the other CPUs are also performing work. Unfortunately, the disadvantage of multiple CPU hardware is that the sequential processing methods which have long been used in software development must be replaced by more appropriate parallelized computing methods. Simply stated, MPPs require that processing methods be developed which make efficient use of the multiple CPU hardware. Ideally, these methods should organize the distribution of work relatively evenly among the processors, and ensure all processors are performing necessary computations all of the time, rather than awaiting intermediate results from other processors.
The challenge of defining parallelized processing methods, and of optimizing those parallelized methods once defined, is particularly acute in the seismic data processing arena. Seismic data consists generally of a large number of individual traces, each recorded somewhat independent of the other traces. Logically enough, sequential computing methods which require the analytic focus to be placed on a single calculation at a time adapt well to analysis of these independent traces. This is true even though computational bottlenecks may exist. For example, portions of the analytic sequence may require relatively more computation time than other portions, must be completed before other calculations may proceed, or may rely on similar input data as other traces, for example traveltimes. Since no simultaneous computations occur in sequential processing, none of these bottlenecks lead to a reduction in computational efficiency with a single CPU, except as to the total processing time which is required. And except as to that total time requirement, the existence of such computational bottlenecks does not otherwise pose problems for the analyst. To take full advantage of MPP computing capabilities, however, where the goal is to perform simultaneous processing in all CPUs, methods for optimizing the seismic analysis phase by eliminating such bottlenecks must be developed.
This advantage of an MPP becomes clear by considering the limitation which calculation time places on image region size in single CPU computers. As the size of the image increases, for example by expanding the size of cube 20 in FIG. 1, or the amount of data to be processed increases, for example by adding additional sources 3 and receivers 4 to shotlines 2, the total computation time is directly lengthened by the number of additional individual calculations. That direct impact on calculation time places a heavy burden on seismic analysts to optimize image size, especially since even small image regions may require weeks of computation time on even the highest speed sequential processing computers. In contrast, efficient processing on MPPs, which may have as many as or more than 256 individual CPUs, should only involve minimally lengthened computation times, since each CPU would assume just a fraction, for example 1/256, of the additional work required by the larger region. This potential for scalability of the image region and the work load required in image generation is a principal benefit of MPPs, a benefit which can only be realized if parallelized seismic processing methods allowing such workload scalability are developed.
Basic considerations for determining efficient parallelized seismic processing methods become evident by reconsidering the above review of the seismic analysis process. As noted, the purpose of seismic analysis is to analyze measured seismic data using geophysical models to develop images of the subsurface. Therefore, each of three principal processing components--data, model, and image--may be considered to be candidates for distributing computational work among the various processors in an MPP. One option for distributing work among the processors would be to assign different groups of the input seismic trace data to different processors. For example, traces may be grouped by source locations, with different processors being assigned different groups. Similarly, the output image could be subdivided and assigned to different processors. In this case, the groupings and processor assignments would be, for example, as to different surface positions in the output image or depth locations of reflectors of interest. Finally, it may also be possible to subdivide the geophysical model used to generate the output image into groupings which can be assigned to the various processors (That model is generally considered to be embodied in the arithmetic operations required by the mathematical model which is the subject of the processing effort. For example, in seismic analysis the mathematical model is often based on the wave equation). For example, the data may be transformed into the frequency domain, with individual frequencies assigned to individual processors. It may also be possible to develop combinations of these approaches. For example, groups of processors may be assigned collective responsibility for specific frequencies in the model and all depths in the image, while having individual responsibility for specific horizontal locations in the image. The challenge to the seismic data analyst is to determine methods of subdividing the seismic data, model, and image into components which can be assigned to individual processors in the MPP, thus allowing calculations to be performed in each processor independently of other processors. This subdivision of seismic data analysis into individual components is commonly referred to as seismic decomposition.
Efficient use of an MPP requires more than mere subdividing of the seismic data, model and image components however. A number of additional considerations must be taken into account if the resulting parallel processing method is to efficiently use all CPUs in an MPP. These additional considerations relate specifically to the unique calculations involved in seismic analysis. For example, as demonstrated in FIG. 1A, a seismic dataset consists of a large number of seismic signals 5, with each signal being associated with a specific source and receiver pair. As is well known in the art, the sequence of received signals at each receiver represents the time history of reflections from a possibly large number of subsurface reflectors. Processing of all portions of all traces for development of any specific portion of the output image may not be time or cost effective, however, since all traces will not in general contribute to all portions of the image. As a result, seismic analysts generally specify a fixed region surrounding the portion of the image being computed. Only the traces within that region, referred to as the aperture, are used to compute the image. To ensure that all important data is retained for the image computation, the aperture is usually conservatively specified to have relatively wide dimensions and is invariant with depth within the imaging region and with source-receiver offset.
Unfortunately, this conservative approach leads to inefficient use of computer resources. For example, the propagation path for a given source-receiver pair and a specific reflector may be such that the received signal occurs at a late arrival time. Alternatively, the slope of the reflector may be such that the signal-to-noise ratio of the received signal is insufficient to contribute to the generation of the portion of the image of principal interest. Finally, computation of an image from some combinations of all available data traces may lead to a well known imaging problem referred to as aliasing. For these reasons, an efficient method of processing seismic data on an MPP should include a technique for determining the seismic aperture, such that those traces which are not useful to the image can be discarded before significant processor effort has been expended. Ideally, the MPP processing aperture should at least be a function of depth and source-receiver offset.
An additional consideration in seismic analysis relates to computation and storage of traveltime or velocity fields, one or the other of which is a required input to the geophysical model which is used to compute an image from the data. In general, seismic analysis routines calculate these fields for the entire region from which data was acquired, without a determination as to whether the data corresponding to specific portions of the fields will affect the image. Furthermore, because large computer memory resources would be required to retain the calculated fields, seismic analysts frequently recompute the fields several times on an as needed basis during the image generation process. Although this procedure minimizes memory usage, the penalty is a slowing down of the image generation process due to the frequent recomputations. As a result of that penalty, seismic analysts often minimize the size of the region to be imaged, so as to minimize field recomputation needs during imaging. For these reasons, an efficient method of processing seismic data on an MPP should minimize the memory storage requirements of the traveltime or velocity fields, while also allowing efficient recomputations of the necessary values by each CPU only as and when required.
Prior art methods of processing seismic data are also limited as to simultaneous computation abilities. Specifically, analysts generally prefer to sort the seismic data prior to processing, for example, into common offsets. Thereafter, the processing procedures are carried out. Because it is preferable to perform all processing for each common offset separately from the processing of subsequent offsets, prior art methods focus on a single common offset at a time, and perform all necessary data input, storage, calculations, and recalculations as to that offset prior to initiating subsequent offset calculations. However, this approach is an inefficient use of computer resources, because such input data as traveltimes are thereby required to be re-input for each offset. For that reason, a method or processing seismic data allowing simultaneous processing of multiple common offsets (i.e. two to four minimum) is desired. Such a method would reduce overall processing time and cost by reducing repetitive input/output (I/O) demand, and thereby increasing analysis efficiency.
Another constraint on the design of efficient parallelized methods of processing seismic data relates to the manner and frequency with which communication between CPUs and from CPUs to peripheral devices, such as disks and tape drives, is carried out. Although the goal is to have the multiple CPUs operating simultaneously, it is clear that except in a few unusual cases some amount of data sharing or computation synchronization may be necessary. Regardless of the method by which the seismic data, model or image are subdivided and assigned to individual CPUs, if the subdivision and assignment method requires a high level of communication among the MPP system's components, whether CPUs or peripheral devices, the MPP's performance will be limited. Therefore, an efficient method of processing seismic data on a parallel processor should maximize the computations that can be performed with a minimum amount of intercomponent communication. And to the extent that some intercomponent communication will always be required, again except in a few unusual cases, an efficient method of processing seismic data on an MPP will ensure that individual components are not forced to spend a significant amount of time waiting on delivery of the data or the instructions required for the next computation to be carried out.
From the foregoing, it can be seen that an improved method of processing seismic data on parallel processors is required. Preferably, the method should allow the computations in the seismic analysis phase to be decomposed into components which can be assigned to the individual processors in the MPP, with the seismic data contributing to the processing objective of each component relatively independently from the other components. The decomposition method and processor assignment scheme should be easily scaleable to datasets of varying size, and should optionally allow the computational workload to be balanced among all processors, if such a balance is desired. The method should include the capability to vary the size of the analysis aperture, and provide for minimized traveltime and velocity field recomputation and storage requirements. Finally, the method should involve a relatively balanced input/output data transfer requirement for each component, with such transfers substantially independent of transfers involving other components. The present invention satisfies these needs.