The technology described herein relates to encoding and decoding of numerical data, using specialized single instruction multiple data (SIMD) instructions for efficient storage and/or transfer of encoded data in a computing system.
In present high-performance computing applications, it is often necessary to transfer vast amounts of numerical data among multiple processor cores or between processor cores and memory. The limited data transfer rates of interfaces among processor cores and between cores and memory devices can create bottlenecks for overall data processing speed and performance. In data-rich applications, storage of numerical data challenges memory resources and storage devices. Reducing the demands on data transfer and storage capacity for numerical data can improve the efficiency, economy and performance of the computing system. Compression of numerical data may reduce these demands, but at the cost of additional computations. In applications having vast quantities of numerical data, it is especially important that the compression be computationally efficient in order to minimize demands on computing resources.
In present microprocessor architectures, single instruction, multiple data (SIMD) processing performs the same operation indicated by a single instruction on multiple data elements, or operands. A SIMD operation is performed in parallel on the multiple operands, rather than sequentially, thus accelerating computations. Advantages of SIMD implementations include reduced processing time over sequential processing, decreased numbers of instructions and greater processing efficiency. Implementations of SIMD technology are available from many companies, including: Intel and AMD, whose SIMD instruction sets are commonly called MMX, SSE, and AVX; Advanced RISC Machines (ARM), whose SIMD instruction set is called Neon; and IBM, Freescale, and Apple, whose SIMD instruction set is called AltiVec.
The previous list of SIMD implementations is not meant to be exhaustive; it only illustrates that SIMD processing has been widely integrated into microprocessor architectures. Embodiments of the present invention utilize novel SIMD constructs, and define functionality that may be implemented in new SIMD instructions, to accelerate the process of encoding and decoding a plurality of numerical samples per instruction.
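The SIMD principle described above, one operation applied to several operands at once, can be illustrated with a minimal sketch. The sketch is an assumption made for illustration only: it does not use any of the instruction sets listed above, and it is not the encoding or decoding method of the present specification. It emulates four 16-bit lanes packed into one 64-bit word and adds all four lanes with a single integer addition, using masking so that a carry never crosses from one lane into the next. The names pack, unpack, add_sequential and add_packed are hypothetical.

```python
LANES = 4                             # operands held in one packed word
LANE_BITS = 16                        # width of each operand in bits
LANE_MASK = (1 << LANE_BITS) - 1
HIGH_BITS = 0x8000_8000_8000_8000     # top bit of every lane
LOW_BITS = 0x7FFF_7FFF_7FFF_7FFF      # lower 15 bits of every lane


def pack(values):
    """Place LANES unsigned 16-bit values side by side in one integer."""
    word = 0
    for i, v in enumerate(values):
        word |= (v & LANE_MASK) << (i * LANE_BITS)
    return word


def unpack(word):
    """Split a packed word back into its LANES individual values."""
    return [(word >> (i * LANE_BITS)) & LANE_MASK for i in range(LANES)]


def add_sequential(a, b):
    """Reference: one addition per operand pair, wrapping modulo 2**16."""
    return [(x + y) & LANE_MASK for x, y in zip(a, b)]


def add_packed(a_word, b_word):
    """Add all lanes with a single addition on the packed words.

    The lower 15 bits of each lane are added first, so any carry stays
    inside its own lane. The top bit of each lane is then corrected
    with XOR, which discards the carry out of the lane.
    """
    low_sum = (a_word & LOW_BITS) + (b_word & LOW_BITS)
    return low_sum ^ ((a_word ^ b_word) & HIGH_BITS)


a = [1, 2, 3, 65535]
b = [10, 20, 30, 1]
print(add_sequential(a, b))                     # [11, 22, 33, 0]
print(unpack(add_packed(pack(a), pack(b))))     # [11, 22, 33, 0]
```

Both calls produce the same per-lane result, including wraparound in the last lane. The packed version needs one addition where the sequential version needs four, which is the source of the reduced instruction count noted above. A hardware SIMD instruction provides the same lane isolation directly, without the masking steps.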
Commonly owned patents and applications describe a variety of compression techniques applicable to fixed-point, or integer, representations of numerical data or signal samples. These include U.S. Pat. No. 5,839,100 (the '100 patent), entitled “Lossless and loss-limited Compression of Sampled Data Signals” by Wegener, issued Nov. 17, 1998. The commonly owned U.S. Pat. No. 7,009,533 (the '533 patent), entitled “Adaptive Compression and Decompression of Bandlimited Signals,” by Wegener, issued Mar. 7, 2006, incorporated herein by reference, describes compression algorithms that are configurable based on the signal data characteristic and measurement of pertinent signal characteristics for compression. The commonly owned U.S. Pat. No. 8,301,803 (the '803 patent), entitled “Block Floating-point Compression of Signal Data,” by Wegener, issued Apr. 28, 2011, incorporated herein by reference, describes a block-floating-point encoder and decoder for integer samples. The commonly owned U.S. patent application Ser. No. 13/534,330 (the '330 application), filed Jun. 27, 2012, entitled “Computationally Efficient Compression of Floating-Point Data,” by Wegener, incorporated herein by reference, describes algorithms for direct compression of floating-point data by processing the exponent values and the mantissa values of the floating-point format. The commonly owned patent application Ser. No. 13/617,061 (the '061 application), filed Sep. 14, 2012, entitled “Conversion and Compression of Floating-Point and Integer Data,” by Wegener, incorporated herein by reference, describes algorithms for converting floating-point data to integer data and compression of the integer data. At least a portion of the operations for compression and decompression described in these patents and applications may be implemented using the SIMD technology described in the present specification.
The commonly owned patent application Ser. No. 12/891,312 (the '312 application), entitled “Enhanced Multi-processor Waveform Data Exchange Using Compression and Decompression,” by Wegener, publication number 2011-0078222, published Mar. 31, 2011, incorporated by reference herein, describes configurable compression and decompression for fixed-point or integer numerical data types in computing systems having multi-core processors. In a multi-core processing environment, input, intermediate, and output waveform data are often exchanged among cores and between cores and memory devices. The '312 application describes a configurable compressor/decompressor at each core that can compress/decompress integer or numerical waveform data. The '312 application describes configurable compression/decompression at the memory controller to compress/decompress integer or numerical waveform data for transfer to/from off-chip memory in compressed packets. At least some operations of the configurable compressor and decompressor of the '312 application may be implemented using the SIMD technology described in the present specification.
The commonly owned patent application Ser. No. 13/617,205 (the '205 application), filed Sep. 14, 2012, entitled “Data Compression for Direct Memory Access Transfers,” by Wegener, incorporated herein by reference, describes providing compression for direct memory access (DMA) transfers of data and parameters for compression via a DMA descriptor. The commonly owned patent application Ser. No. 13/616,898 (the '898 application), filed Sep. 14, 2012, entitled “Processing System and Method Including Data Compression API,” by Wegener, incorporated herein by reference, describes an application programming interface (API), including operations and parameters for the operations, which provides for data compression and decompression in conjunction with processes for moving data between memory elements of a memory system. The SIMD instructions described herein may be implemented for the compression and decompression operations described in the '205 application and the '898 application.
In order to better meet the requirements of higher speed data transfer, reduced memory utilization and minimal computation in many computing applications, a need exists for computationally efficient compression and decompression of numerical data using SIMD technology.