Seismic Acquisition & Processing
The Earth's subsurface can be imaged by a seismic survey; seismic data acquisition and processing are therefore key components of geophysical exploration. In a seismic survey, elastic acoustic waves are generated by a source at the Earth's surface and radiate into the Earth's subsurface. For land seismic surveys, the usual source is dynamite or a seismic vibrator, while for a marine seismic survey the source is typically an airgun array.
As the waves radiate downward through the Earth's subsurface, they reflect and propagate upwards towards the surface whenever the subsurface medium changes. The upward reflections are detected by a number of receivers and the reflected data recorded and processed in order to image the subsurface. Interpretation of these acoustic images of the subsurface formation leads to the structural description of the subsurface geological features, such as faults, salt domes, anticlines, or other features indicative of hydrocarbon traps.
While two dimensional ("2D") seismic surveys have been conducted since the 1920's, three dimensional ("3D") seismic surveys have only recently become widely used. 3D surveys more accurately reflect the subsurface positions of the hydrocarbon traps, but are expensive and time consuming to acquire and process. For an offshore 3D data set covering a 20 × 20 km area, it costs about $3M (1991 dollars) to acquire the data, with another $1M for data processing to transform the raw data into usable images. Because the cost of such a seismic survey is considerably less than the cost of drilling an offshore oil well, 3D seismic surveys are often worth the investment.
One common type of seismic survey is a marine survey, performed by boats in offshore waters. To record seismic data, a boat tows airguns (seismic sources) near its stern, and a "streamer," up to 5 km long, containing hydrophones (seismic receivers) along its length. As the boat sails forward, it fires one source and receives a series of echoes into each seismic receiver. For each source-receiver pair, one prestack seismic trace is created. Each trace records sound waves that echo from abrupt acoustic impedance changes in rock beneath the ocean floor. Also recorded in a prestack trace, in a header section of the trace record, is information about the location of the source and receiver. See, K. M. Barry, D. A. Cavers, and C. W. Kneale. 1975. Recommended Standards for digital tape formats. Geophysics, 40, 344-352. Reprinted in Digital Tape Standards, Society of Exploration Geophysicists. 1980. Prestack traces are not associated with any particular area of the survey. Each echo that appears in a prestack trace is caused by a reflector that lies somewhere along, and tangent to, an elliptical path whose foci are the seismic source and receiver.
The spatial relationship between sources and receivers in a land seismic acquisition scenario differs from that described above; however, the present invention is unaffected by this.
A seismic survey is performed over a bounded region of the earth. This region is generally, but not necessarily precisely, rectangular. The survey area is partitioned into an array of bins. "Binning" is the assignment of traces to the bins of this survey array; a bin is usually a 12.5 by 25 meter rectangle. Any particular bin is located by its Cartesian coordinates in this array (i.e., by its row and column number). The ultimate output of the seismic survey is data that shows the location and strength of seismic reflectors in each bin, as a function of depth or time. This information cannot be deduced directly, but rather must be computed by applying numerous data processing steps to the recorded data.
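As a minimal sketch of how a bin is located by row and column, the following assumes a survey origin at (0, 0), the 12.5 meter side along the column axis, and the 25 meter side along the row axis. None of these orientation choices is stated in the text, and the function name `bin_of_point` is hypothetical.

```python
import math

BIN_WIDTH_M = 12.5   # bin extent along the column (x) axis; orientation is an assumption
BIN_HEIGHT_M = 25.0  # bin extent along the row (y) axis; orientation is an assumption


def bin_of_point(x_m, y_m, origin_xy=(0.0, 0.0)):
    """Return the (row, column) of the bin that contains the surface point (x_m, y_m)."""
    column = math.floor((x_m - origin_xy[0]) / BIN_WIDTH_M)
    row = math.floor((y_m - origin_xy[1]) / BIN_HEIGHT_M)
    return row, column


# Illustrative use: locate the bin under a source-receiver midpoint.
# Assigning a trace by its midpoint is one common convention, assumed here.
source = (0.0, 0.0)
receiver = (100.0, 50.0)
midpoint = ((source[0] + receiver[0]) / 2.0, (source[1] + receiver[1]) / 2.0)
row, column = bin_of_point(*midpoint)
```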
Although 3D marine surveys vary widely in size (1,000 to 100,000 km²), a typical marine survey might generate in excess of 40,000 data acquisition tapes. Data is accumulated at a staggering rate, about 1.5 million data samples every 10 seconds. A significant amount of time and money is spent in processing this enormous amount of data. The result of the seismic survey is thus an enormous amount of raw data indicative of reflected signals which are a function of travel time, propagation, and reflection effects. The goal is to present the reflected amplitudes as a function of lateral position and depth.
A typical marine seismic survey goes through three distinct sequential stages: data acquisition, data processing, and data interpretation. Data processing is by far the most time consuming of the three. The acquisition time for a medium to large 3D marine seismic survey is on the order of two months. In addition to seismic data, navigation information is also recorded for accurate positioning of the sources and receivers. The resulting digital data must be rendered suitable for interpretation purposes by processing the data at an onshore processing center. The processing sequence can be divided into the following five processing steps.
1. Quality control, filtering and deconvolution. This processing is applied on a trace basis to filter noise, sharpen the recorded response, suppress multiple echoes, and generally improve the signal-to-noise ratio. Most of these signal processing operations can be highly vectorized.
2. Velocity analyses for migration. This processing estimates the velocity of the sub-surface formations from the recorded data by modeling the propagation of acoustic waves with estimated velocities and checking for signal coherence in the acquired data. It is similar to migration but is applied to a small section of the data cube.
3. 3D dip moveout correction and stacking. This processing step, generally the most input/output intensive part of the processing, (i) sums together several traces in order to eliminate redundancy and increase the signal-to-noise ratio, (ii) corrects for time delays that occur when the reflected signal is recorded by successive hydrophones that are located increasingly farther away from the energy source, and (iii) positions and orients the stacked data in accordance with the navigation information. After this processing step, the data is referred to as stacked data. This step normally constitutes on the order of a 100 to 1 reduction in data volume.
4. Migration. This processing step, computationally the most intensive, relocates the position of reflected strata, which are recorded in time, to their correct position in depth.
5. Enhancement and filtering. This processing step is used to enhance the migrated data using digital filtering techniques.
The stacking process (step 3) reduces the amount of data to what is essentially a three dimensional array of numbers (i.e., a data cube) representing amplitudes of reflected seismic waves recorded over a period of time (usually 8 seconds). Such data cubes can be large; for example, a medium size 3D survey may produce cubes as large as 1000 × 1000 × 2000 floating-point numbers.
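To give a feel for the size of such a cube, the following sketch counts its samples and estimates its storage. The 4 bytes per sample (32-bit floating point) is an assumption; the text does not state the sample format.

```python
# Cube dimensions quoted in the text: 1000 x 1000 traces, 2000 samples per trace.
N_ROWS, N_COLUMNS, N_SAMPLES = 1000, 1000, 2000
BYTES_PER_SAMPLE = 4  # assumption: 32-bit floating-point samples

cube_samples = N_ROWS * N_COLUMNS * N_SAMPLES
cube_bytes = cube_samples * BYTES_PER_SAMPLE

print(f"{cube_samples:,} samples, about {cube_bytes / 1e9:.0f} GB")
```

Under that assumption the cube holds two billion samples and occupies roughly 8 GB.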
The stacked data cube represents a surface recording of acoustic echoes returned from the earth interior and is not usually directly interpretable. The migration (or acoustic imaging process, step 4) is used to convert stacked data into an image or a map which can then be viewed as a true depth map cut out of the survey area.
Migration is thus one of the most critical and most time consuming components in seismic processing. Generally speaking, migration transforms the seismic data recorded as a function of time into data positioned as a function of depth using preliminary knowledge of the propagation velocities of the subsurface. In particular, migration moves dipping reflectors to their true subsurface position. Migration is typically performed on post stack seismic data to reduce the amount of processing time, but even so takes weeks of conventional supercomputer time for even medium size post stack seismic data cubes.
Many types of stacking and migration processes are well known. See, O. Yilmaz. 1987. Seismic Data Processing. Tulsa, Okla.: Society of Exploration Geophysicists. Usually, one poststack trace is associated with each bin. However, it is also possible to create multiple poststack traces per bin. For example, each such trace might contain contributions from prestack traces whose source-receiver separation falls within a specific range. (In this case, the bin is said to contain a common depth-point or common midpoint gather.)
Stacking programs create poststack data from prestack data by simple manipulation of prestack data. In general, a stacking program transforms each prestack trace exactly once. Migration programs create poststack data from prestack data by more complicated, computationally intensive, manipulation of the same data. Migration programs transform each prestack trace a large number of times, requiring commensurately more computation than simpler stacking programs. Multiple prestack traces are transformed and added together and superimposed to create the one or more poststack traces associated with a bin.
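The "added together" part of stacking can be sketched as a sample-by-sample combination of the traces assigned to one bin. This sketch assumes the traces have already been transformed (moveout corrected) and share the same length. Dividing by the trace count is an illustrative normalization choice, and the function name `stack_traces` is hypothetical.

```python
def stack_traces(traces):
    """Average equally long traces sample by sample to form one poststack trace."""
    if not traces:
        raise ValueError("at least one trace is required")
    n_samples = len(traces[0])
    if any(len(trace) != n_samples for trace in traces):
        raise ValueError("all traces must have the same number of samples")
    fold = len(traces)  # number of traces contributing to the bin
    return [sum(trace[i] for trace in traces) / fold for i in range(n_samples)]


# Two toy traces with two samples each.
stacked = stack_traces([[1.0, 2.0], [3.0, 4.0]])
```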
One such (partial) migration program is "3D Dip Moveout" (DMO) using the Kirchhoff method. See, U.S. Pat. No. 5,198,979 Moorhead et al. See also, S. Deregowski and F. Rocca. 1981. Geometrical optics and wave theory for constant-offset sections in layered media. Geophysical Prospecting, 29, 374-387. DMO creates a poststack data set that can be input to another (full) migration program. DMO transforms a prestack trace once for each bin that lies under a line drawn between the seismic source and receiver. This line is referred to as the "coverage" of the trace. The coverage of a trace in another migration program, 3D Kirchhoff prestack depth migration, is substantially larger than that in a DMO program.
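The coverage of a trace in DMO can be illustrated by listing the bins that lie under the source-receiver line. The sketch below samples the line at a quarter of the smaller bin dimension, an approximation that can miss a bin the line only clips at a corner; an exact grid traversal would avoid that. The bin dimensions, their orientation, the origin at (0, 0), and the function name `coverage_bins` are assumptions made for illustration.

```python
import math


def coverage_bins(source_xy, receiver_xy, bin_width=12.5, bin_height=25.0):
    """Return the (row, column) bins under the source-receiver line, in order."""
    sx, sy = source_xy
    rx, ry = receiver_xy
    length = math.hypot(rx - sx, ry - sy)
    step = min(bin_width, bin_height) / 4.0  # sampling interval along the line
    n_steps = max(1, math.ceil(length / step))
    bins, seen = [], set()
    for i in range(n_steps + 1):
        t = i / n_steps
        x = sx + t * (rx - sx)
        y = sy + t * (ry - sy)
        bin_id = (math.floor(y / bin_height), math.floor(x / bin_width))
        if bin_id not in seen:  # keep each bin once, preserving order along the line
            seen.add(bin_id)
            bins.append(bin_id)
    return bins


# A source-receiver pair separated by about 3 km along the column axis.
covered = coverage_bins((1.0, 5.0), (2999.0, 5.0))
```

For this pair the line crosses 240 bins, each of which would receive one transformed copy of the trace.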
Though efficient, the Kirchhoff approach is still computationally expensive. Approximately 30 arithmetic operations (floating-point operations, or FLOPs) are required for each sample of each transformed trace. Given an average shot-receiver separation of 3 kilometers, a bin width of 12.5 meters, and 8 seconds worth of data in each trace acquired at 4 ms/sample, this implies an average of approximately 14.4 million FLOPs per trace (240 bins × 2,000 samples × 30 FLOPs). A typical 20 km square marine survey using 12.5 meter wide, 25 meter tall bins contains perhaps 80 million prestack traces. The DMO process using the Kirchhoff approach thus consumes on the order of 1,000 trillion FLOPs (80 million traces × 14.4 million FLOPs per trace). This computational expense motivates the implementation of migration programs such as DMO on some form of high-performance supercomputer, such as a massively parallel processor (MPP). See, Thinking Machines Corporation, 1993. The Connection Machine CM-5 Technical Summary. Such a processor is an attractive platform upon which to execute migration programs, because its performance scales up as its size increases; thus, the system can grow incrementally as the computation demand of the processing organization increases. See also, W. Daniel Hillis and Lewis W. Tucker, The CM-5 Connection Machine: A Scalable Supercomputer, Communications of the ACM, November 1993, Vol. 36, No. 11, pp 31-40.
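The per-trace estimate can be reproduced from the figures quoted in the text. The sketch below uses only those figures; the variable names are illustrative.

```python
FLOPS_PER_SAMPLE = 30        # per sample of each transformed trace
OFFSET_M = 3000.0            # average shot-receiver separation
BIN_WIDTH_M = 12.5
TRACE_LENGTH_MS = 8000       # 8 seconds of data per trace
SAMPLE_INTERVAL_MS = 4
PRESTACK_TRACES = 80_000_000

bins_per_trace = int(OFFSET_M / BIN_WIDTH_M)                 # bins under the source-receiver line
samples_per_trace = TRACE_LENGTH_MS // SAMPLE_INTERVAL_MS    # samples in each transformed trace
flops_per_trace = FLOPS_PER_SAMPLE * bins_per_trace * samples_per_trace
flops_per_survey = flops_per_trace * PRESTACK_TRACES

print(f"{flops_per_trace:,} FLOPs per trace")
print(f"{flops_per_survey:.3e} FLOPs per survey")
```

Each trace is transformed once for each of 240 bins at 2,000 samples per transform, which gives 14.4 million FLOPs per trace. Multiplying by the stated trace count gives a survey total on the order of 10^15 FLOPs.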
Parallel Computation
As shown in FIG. 1, an MPP 10 consists of three major components: (i) a disk storage system 12 whose capacity and data transfer rate can be scaled up as storage and data throughput requirements demand, (ii) a data and control communications network 14 that ties together the processors and the disk storage system, and (iii) a set of processing nodes 16 (see FIG. 2), each containing at least one processor 18, memory 20, and interface 22 to the data and control network 14. The capacity of the data network 14 (the amount of data it can transport in a given amount of time) scales as the number of processors increases. The size of the set of processing nodes 16 can be scaled up as computation requirements demand. On an MPP, processor nodes 16 can execute independently from one another; however, the control portion of the data and control communications network 14 provides a means by which all nodes 16 can synchronize their activities.
An MPP can improve the performance of computationally-intensive seismic processing such as migration programs, because it is possible to partition the work to be done and assign a part of the work to each processor node 16. For this approach to scale as the size of the MPP scales, the partitions must be truly independent of one another, such that no two processors share work. For example, each bin in the survey area must be assigned to one and only one processor at any one time.
A straightforward partitioning is one that assigns nonoverlapping rectangular areas of bins to processors. This satisfies the independence requirement because each bin is independent of other bins. However, many, but not all, bins will be covered by a common prestack trace. Thus, it must be possible to efficiently input the same prestack trace, which is often stored on the disk storage system 12, to multiple processors. Each processor will then transform the trace slightly differently than the other processors, and add it to the developing poststack trace associated with the assigned bin.
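A partitioning of this kind can be sketched as a block decomposition of the bin array over a logical grid of processing nodes. The node grid shape, the row-major node numbering, and the function name `owner_of_bin` are assumptions made for illustration, not details taken from the text.

```python
def owner_of_bin(row, column, n_rows, n_columns, node_rows, node_columns):
    """Return the id of the single node that owns bin (row, column).

    The bin array is cut into node_rows x node_columns nonoverlapping
    rectangular blocks; node ids are numbered row by row.
    """
    block_height = -(-n_rows // node_rows)        # ceiling division
    block_width = -(-n_columns // node_columns)   # ceiling division
    return (row // block_height) * node_columns + (column // block_width)


# Example: an 800 x 1600 bin array (a 20 km square of 25 m x 12.5 m bins)
# spread over a hypothetical 4 x 8 grid of 32 nodes.
first = owner_of_bin(0, 0, 800, 1600, 4, 8)
last = owner_of_bin(799, 1599, 800, 1600, 4, 8)
```

Because every bin maps to exactly one node id, no two nodes share a bin, which is the independence property described above.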
Two obvious trace input strategies exist. The first inputs the same prestack trace to all processing nodes 16. In general, this is an inefficient use of an MPP. Often, many nodes 16 that receive the prestack trace are assigned to bins outside that trace's coverage. If a processing node 16 were to receive such a trace, it would simply ignore it and wait until it receives the next trace. The result is inefficient use of an MPP's processors, since many processors can waste time discarding traces.
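The cost of this first strategy can be estimated by counting how many nodes own at least one bin in a trace's coverage. The sketch below assumes a block partition of bins over 32 nodes and a trace whose coverage is 240 bins in a single row. The block size, node count, and function names are hypothetical.

```python
def block_owner(row, column, block_height=200, block_width=200, node_columns=8):
    """Node id that owns a bin under an assumed 200 x 200 bin block partition."""
    return (row // block_height) * node_columns + (column // block_width)


def nodes_for_trace(coverage, owner):
    """Sorted ids of the nodes that own at least one bin in the coverage."""
    return sorted({owner(row, column) for row, column in coverage})


def broadcast_waste(coverage, owner, n_nodes):
    """Fraction of nodes that would receive the trace and then discard it."""
    useful = len(nodes_for_trace(coverage, owner))
    return (n_nodes - useful) / n_nodes


# A trace covering 240 consecutive bins in row 0 (about 3 km at 12.5 m per bin).
coverage = [(0, column) for column in range(240)]
needed = nodes_for_trace(coverage, block_owner)
wasted = broadcast_waste(coverage, block_owner, n_nodes=32)
```

In this configuration only two of the 32 nodes can use the trace, so about 94 percent of the nodes would discard it.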
A second strategy is to determine which traces cover which bins, and input from the disk storage system different traces into different processing nodes 16, such that only traces that cover a processing node's assigned bins are sent to that processor. Although this is efficient from the perspective of processor usage, it can be inefficient from the perspective of disk usage, because it creates contention as different processors attempt to read from different parts of the disk storage system at the same time. If the disk system is implemented as a single logical disk, as in a RAID disk system, see S. J. Lo Verso, M. Isman, A. Nanopoulos, W. Nesheim, E. D. Milne, and R. Wheeler. SFS: A parallel file system for the CM-5. In Proceedings of the 1993 Usenix Conference, different processing nodes 16 will be contending for control of the disk system 12 (i.e., for control of the placement of the disk heads). This results in delays in moving data from the disk system 12 to processor memory 20.
If, by contrast, the disk system 12 is implemented as multiple disks, then there may be no contention, provided each processing node 16 is reading from a separate disk. However, this disk architecture does not scale well. In general, it is implemented as a disk per processor, with fast access to the disk from the processor to which the disk is connected, and significantly slower access to the disk from other processors. See, Kendall Square Research, 1993. Technical Summary. In addition, the potential for contention for a single disk exists if two or more processors try to read from the same disk.
Approaches are developing for applying parallel computation to seismic processing problems. See, U.S. Pat. No. 5,198,979 Moorhead et al. and U.S. Pat. application Ser. No. 07/811,565 Pieprzak et al. (Allowed). In this application, all references to patents are incorporated by reference and all other references are incorporated by reference for background.