Data compression is an effective tool for managing ever-growing data demands, enabling significant reductions in in-memory data sizes, storage footprints, and network bandwidth usage. However, data compression may add significant processing overhead, particularly when it relies on common compression codecs based on the Lempel-Ziv paradigm, such as ZLIB and LZO. These codecs tend to prioritize higher compression ratios for smaller disk storage footprints, rather than minimizing processing overhead.
For demanding, high-bandwidth applications that use big data sets, including database management systems (DBMSs) for large enterprises or high-performance computing, performance is often measured and evaluated by query latency and the associated query throughput. To minimize such latency, data to be processed is cached in-memory when possible. Since caching data in-memory eliminates the latency of reading it from disk storage, data compression may then account for a significant portion of the remaining latency. While common compression codecs can be implemented in hardware for reduced latency, the complexity of the algorithms may increase hardware development and fabrication costs. Furthermore, these compression codecs are typically designed for sequential access to data and may be unsuitable for supporting highly random data access, such as that of online transaction processing (OLTP) workloads.
Based on the foregoing, there is a need for a method to provide cost-effective, high-performance data compression suitable for latency-sensitive applications and/or highly random access workloads.
The approaches described in this section are approaches that could be pursued, but not necessarily approaches that have been previously conceived or pursued. Therefore, unless otherwise indicated, it should not be assumed that any of the approaches described in this section qualify as prior art merely by virtue of their inclusion in this section.