24-bit AD conversion is currently the most commonly used analog-to-digital conversion technology in geophysical exploration equipment, especially seismic exploration equipment. The sampled data in this study are 3-byte signed integers produced by 24-bit analog-to-digital conversion. In general, the higher the bit depth of the AD converter, the greater the dynamic range of data acquisition, but this comes with an increase in the amount of data. In recent years, the number of acquisition channels in distributed geophysical instruments (especially seismic instruments) has grown substantially, placing heavy loads on data storage and data transmission. This has created strong demand for methods that preserve high acquisition precision while improving data transmission efficiency and reducing storage space. Existing seismic data compression algorithms can be divided into two types: lossy compression and lossless compression. Lossless compression typically uses general-purpose computer data compression methods, which exploit statistical redundancy in the data. Such methods need a certain amount of data to accumulate after acquisition before they can operate, and they are not used for real-time transmission of data streams. Lossy compression algorithms for seismic data typically rely on the Fourier transform, the wavelet transform, or other mathematical methods to transform the seismic data from the time domain to another domain in order to reduce the amount of data. Lossy compression methods require complex calculations, and decompression cannot completely restore the original data. Moreover, these methods require complete seismic data to operate and cannot function on small data samples. Finally, lossy compression methods cannot compress data streams in real time.
In seismic data acquisition, most data values are small (except for a small amount of near-offset data and data recorded shortly after source excitation), and 8 or 16 bits are sufficient to represent them completely. Nevertheless, every sampling point is stored as a 3-byte signed integer, which wastes storage space. As such, there is a need to compress the data. Depending on the magnitude of each sample value, 3 bytes can be compressed to 1 or 2 bytes. The difficulty of this lossless compression technique lies in finding a reasonable encoding, such that during decompression the encoding rules determine the number of bytes occupied by each sampling point and a corresponding decoding procedure restores the original data completely.
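To make the idea of a self-describing variable-length code concrete, the following is a minimal sketch in Python. It is not the encoding proposed in this study, whose rules are not stated in this section. It assumes a hypothetical prefix-tag scheme: a leading bit pattern in the first byte tells the decoder how many bytes the sample occupies. Under this assumption, a 1-byte code carries 7 value bits, a 2-byte code carries 14 value bits, and samples that need the full 24 bits fall back to 4 bytes (one tag byte plus the 3 raw bytes). The function names `encode_sample`, `encode_stream`, and `decode_stream` are illustrative only.

```python
from typing import Iterable, List


def _sign_extend(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of `value` as a two's complement integer."""
    sign = 1 << (bits - 1)
    return (value ^ sign) - sign


def encode_sample(x: int) -> bytes:
    """Encode one signed 24-bit sample into 1, 2, or 4 bytes (assumed scheme)."""
    if not -(1 << 23) <= x < (1 << 23):
        raise ValueError("sample outside the signed 24-bit range")
    if -(1 << 6) <= x < (1 << 6):
        # Tag bit 0, followed by a 7-bit two's complement value.
        return bytes([x & 0x7F])
    if -(1 << 13) <= x < (1 << 13):
        # Tag bits 10, followed by a 14-bit two's complement value.
        v = x & 0x3FFF
        return bytes([0x80 | (v >> 8), v & 0xFF])
    # Tag byte 0xC0 (bits 11 plus padding), followed by the 3 raw bytes.
    v = x & 0xFFFFFF
    return bytes([0xC0, v >> 16, (v >> 8) & 0xFF, v & 0xFF])


def encode_stream(samples: Iterable[int]) -> bytes:
    """Encode samples one at a time, so no buffering of a full trace is needed."""
    return b"".join(encode_sample(x) for x in samples)


def decode_stream(buf: bytes) -> List[int]:
    """Recover the original samples by reading the tag bits of each code."""
    out: List[int] = []
    i = 0
    while i < len(buf):
        b0 = buf[i]
        if b0 & 0x80 == 0:
            out.append(_sign_extend(b0, 7))
            i += 1
        elif b0 & 0xC0 == 0x80:
            v = ((b0 & 0x3F) << 8) | buf[i + 1]
            out.append(_sign_extend(v, 14))
            i += 2
        else:
            v = (buf[i + 1] << 16) | (buf[i + 2] << 8) | buf[i + 3]
            out.append(_sign_extend(v, 24))
            i += 4
    return out
```

Because each sample is encoded independently, a scheme of this kind can run on a data stream as samples arrive. The cost of this particular sketch is that large-amplitude samples grow from 3 to 4 bytes, so it only pays off when small values dominate, as the paragraph above describes for most seismic data.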