Seismic imaging is the process of converting acoustic measurements of the Earth into images of the Earth's interior, much like ultrasound for medical imaging. It is widely used in oil and gas exploration and production to identify regions that are likely to contain hydrocarbon reservoirs and to help characterize known reservoirs to maximize production. These methods have become critical to the energy industry as known reserves are used up and new reserves become increasingly difficult (and expensive) to find and are increasingly in technically challenging areas, like the deep sea. For the past several decades, the energy industry has tried to balance the need to image quickly and the need to image accurately. The need for accuracy is driven by the high cost of drilling a “dry” well due to poor imaging (a deep sea well can cost over $100 million) and the need for quick imaging is driven by the cost of not finding new reserves (i.e., bankruptcy). To minimize these costs, the industry relies on supercomputing clusters and regularly increases compute power, enabling both faster imaging on existing algorithms and the practical implementation of more accurate imaging. Thus, the development of fast, efficient methods for imaging is of high importance to the industry.
Seismic imaging data varies widely depending on how and where the data is collected (e.g., on land, at sea, at the ocean surface, at the ocean floor, below ground, electromagnetically, etc). One data collection method in particular implements a towed hydrophone receiver arrays for ocean seismic data collection. The basic idea is shown in FIGS. 1 and 2 depicting a ship 200 shown towing a 2D array 212 of hydrophones spaced about every 25 m on 1 to 16 trailed streamers. Every 15 or so seconds, an air cannon is fired into the water creating an acoustic wave, e.g., a shot 209 that propagates through the water and into the Earth. As shown in FIG. 2, reflections 202 in a forward direction from various surface and subsurface boundaries cause echoes that reflect back and are recorded by each hydrophone in the array 212. The recording of a single hydrophone in time as a trace and the collection of traces for a single firing of the air cannon is called a common shot gather, or shot 211. As the ship 200 moves, a large set of spatially overlapping shots is recorded. Depending on the size of the survey region to be imaged, this data collection can take a month or more and is designed to get the maximal coverage of the area to be imaged. The receiver data R(x, y, z, t) collected for potentially hundreds of thousands of shots is the result of some source data S(x, y, z, t) from a shot 209 at a particular location.
Seismic imaging data varies widely depending on how and where the data is collected (e.g., on land, at sea, at the ocean surface, at the ocean floor, below ground, electromagnetically, etc). One data collection method in particular implements a towed hydrophone receiver arrays for ocean seismic data collection. The basic idea is shown in FIGS. 1 and 2 depicting a ship 200 shown towing a 2D array 212 of hydrophones spaced about every 25 m on 1 to 16 trailed streamers. Every 15 or so seconds, an air cannon is fired into the water creating an acoustic wave, e.g., a shot 209 that propagates through the water and into the Earth. As shown in FIG. 2, reflections 202 in a forward direction from various surface and subsurface boundaries cause echoes that reflect back and are recorded by each hydrophone in the array 212. The recording of a single hydrophone in time as a trace and the collection of traces for a single firing of the air cannon is called a common shot gather, or shot 211. As the ship 200 moves, a large set of spatially overlapping shots is recorded. Depending on the size of the survey region to be imaged, this data collection can take a month or more and is designed to get the maximal coverage of the area to be imaged. The receiver data R(x, y, z, t) collected for potentially hundreds of thousands of shots is the result of some source data S(x, y, z, t) from shots 209 at particular locations.
Two critical requirements drive production seismic imaging: The need for improved imaging to accurately locate and characterize elusive oil and gas reservoirs; and the need for timely subsurface imaging. Drilling too soon risks expensive dry wells while drilling too late risks delaying the time to oil. To minimize these risks, the industry regularly increases the power of its supercomputing clusters, enabling both faster imaging on existing algorithms and the practical implementation of more accurate imaging. However, raw supercomputing power is not enough. It is equally important—if not more so—to implement algorithms efficiently based on a detailed knowledge of the hardware.
The Reverse Time Migration (RTM) algorithm (see, e.g., Gray, S. H., Etgen, J., Dellinger, J., Whitmore, D., Seismic migration problems and solutions, Geophysics 66, 1622, 2001) is widely used in the industry because of its superior imaging accuracy for difficult subsurface structures like salt domes which are poorly imaged by other algorithms but which are very effective at trapping oil and gas. Several variants of RTM exist with differing degrees of approximation to reality, all of which use single-precision arithmetic.
In seismic imaging techniques has processed shot gathers individually on a single compute node so that they could be processing in parallel. This approach has many benefits; but for algorithms like RTM, as computational power increases, data I/O becomes the single largest performance bottleneck, particularly when the model is large.
The RTM algorithm arises from the observation that pressure waves should be correlated at reflection boundaries; so RTM proceeds by correlating two pressure waves (called the forward and backward waves) to find those boundaries. To generate the waves for correlation, RTM simulates wave propagation using the wave equation for a wave that includes: generating a forward wave calculation PS(x, y, z, t) from equation 1) given a source data term S(x, y, z, t) representing the source “shot” or sound wavefield 209 as shown in FIG. 2,
                                          [                                                            ∂                  x                  2                                ⁢                                  +                                                            ∂                      y                      2                                        ⁢                                          +                                                                        ∂                          z                          2                                                ⁢                                                  -                                                      1                                                                                          v                                2                                                            ⁡                                                              (                                                                  x                                  ,                                  y                                  ,                                  z                                                                )                                                                                                                                                                                                                    ⁢                              ∂                t                2                                      ]                    ⁢                                    P              S                        ⁡                          (                              x                ,                y                ,                z                ,                t                            )                                      =                  S          ⁡                      (                          x              ,              y              ,              z              ,              t                        )                                              1        )            and generating a backward (or reverse) wavefield calculation PR(x, y, z, t) simulating backward wave propagation 204 using the wave equation from equation 2) given the receiver data R(x, y, z, t) collected by the receiver arrays 212 according to:
                                          [                                                            ∂                  x                  2                                ⁢                                  +                                                            ∂                      y                      2                                        ⁢                                          +                                                                        ∂                          z                          2                                                ⁢                                                  -                                                      1                                                                                          v                                2                                                            ⁡                                                              (                                                                  x                                  ,                                  y                                  ,                                  z                                                                )                                                                                                                                                                                                                    ⁢                              ∂                t                2                                      ]                    ⁢                                    P              R                        ⁡                          (                              x                ,                y                ,                z                ,                t                            )                                      =                  R          ⁡                      (                          x              ,              y              ,              z              ,              t                        )                                              2        )            
The forward wavefield computation corresponds to the waves 202 generated from the air cannon firing and propagating forward in time using a “velocity model” represented by V(x,y,z), which specifies the wave velocity at each point in space and represents the various material properties and boundaries of the volume being imaged. The air cannon firing is treated as a wavelet impulse localized in time and space. As shown in FIG. 2, the backward wavefield computation is generated by using the shot data recorded (received) by the hydrophone array as the source term for the wave equation and propagating that as backward propagating waves 204 backward in time. These two waves are then multiplied point-wise at each time step to generate an image, using the following “imaging condition” as shown in equation 3) as follows:I(x,y,z)=ΣtPS(x,y,z,t)PR(x,y,z,t)  3)where PS(x, y, z, t) is the reflected forward power pressure wave at a coordinate and PR(x, y, z, t) is the reverse power pressure wave at an x, y, z coordinate at a time t.
This process is repeated for all shots in the seismic survey and the images generated are summed to create a final image of the reflecting boundaries, which represent the subsurface structure. It is important to note that the time summation in the imaging condition implies that the first time step of the forward wave needs to be correlated with the last time step of the backward wave. This constraint is typically handled in one of two ways: either the forward wave is saved to disk (called a “snapshot”) every several time steps and read in for imaging when the backward wave is computed, or the forward propagation is run twice—once forward in time and once in reverse time using boundary data saved from the forward pass to recreate the forward pass in reverse—and then imaging proceeds with the backward wave and the reverse forward wave.
Both Reverse-time Migration (RTM) and Full Waveform Inversion (FWI) are the state of the art techniques in seismic imaging as they offer multiple improvements over other seismic exploration techniques in accuracy and quality, especially when reserves located in structure formation with large velocity contrast, steep dipping, and salt. However, applying them in oil exploration requires to process very large dataset in the amount of tens of terabytes and intensive computation. Even with advanced parallel computer systems, the prior art implementations of these techniques typically take weeks or months to complete the calculation and generate a good quality seismic subsurface structure image for a model with a volume of 1000×1000×600 points.
One of the major problems of the prior art computing systems is the storage of the intermediate results. For example, in its calculation process, RTM will generate intermediate results in the amount of terabytes that need to be kept for later use. In the prior art solutions, the intermediate results are transmitted out to disk storage and fetched in when needed. Because the disk input/output (I/O) operations are much slower than the parallel computing process, the prior art RTM computing performances are bounded by the disk I/O operation speed and hence result in much slower run-time performance.
The prior art solution to reduce the amount of data through the disk I/O is to compress each wavefield data before it is written to the disk and decompress it after it is read in. To prevent the data fidelity loss, lossless compression is typically used. However, the state of the art lossless compression can only achieve compression ratio in a range around 2˜4 to 1 on wavefield data. So the disk I/O operations for output and input the intermediate data still become the performance bottleneck. Using lossy compression may achieve higher compression ratio but it is not desirable because of the data fidelity loss. Therefore, a new method for intermediate data compression is desirable to improve the performance over the prior art solutions.