Generally, data compression involves taking “symbols” from an input “text”, processing the symbols, and writing “codes” to a compressed file. Most data compression methods in common use today fall into one of two categories: (1) dictionary-based schemes; and (2) statistical methods. Both categories usually have two key objectives: (1) smallest size, i.e., analyzing the source text and compressing the data to its smallest possible representation; and (2) fast decompression, i.e., enabling the data to be replayed, transmitted, or read quickly from its compressed form.
Dictionary-based compression systems operate by replacing groups of symbols in the input text with fixed-length codes. A well-known example of a dictionary technique is LZW data compression. LZW operates by replacing strings of essentially unlimited length with codes that usually range in size from 9 to 16 bits.
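By way of illustration only, the dictionary approach described above can be sketched in Python as follows. This is a minimal textbook form of LZW, not an implementation taken from any particular product; the function name lzw_compress is hypothetical, and the sketch emits plain integer codes, omitting the packing of those codes into 9 to 16 bits and any limit on dictionary size.

```python
def lzw_compress(data: bytes) -> list[int]:
    """Return the sequence of integer codes that textbook LZW emits for `data`."""
    # Start with one dictionary entry per possible byte value (codes 0-255).
    dictionary = {bytes([i]): i for i in range(256)}
    next_code = 256

    current = b""  # longest string seen so far that is already in the dictionary
    codes = []
    for value in data:
        candidate = current + bytes([value])
        if candidate in dictionary:
            # Keep extending the match; nothing is emitted yet.
            current = candidate
        else:
            # Emit the code for the longest known string, then learn the new one.
            codes.append(dictionary[current])
            dictionary[candidate] = next_code
            next_code += 1
            current = bytes([value])

    if current:
        # Flush the final pending match.
        codes.append(dictionary[current])
    return codes


if __name__ == "__main__":
    # Seven input bytes are replaced by four codes.
    print(lzw_compress(b"ABABABA"))  # [65, 66, 256, 258]
```

Note that every input byte requires a dictionary lookup, and every miss requires an insertion; this per-symbol bookkeeping is the analysis cost that dictionary-based schemes incur.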
Statistical methods of data compression take a completely different approach. They operate by encoding symbols one at a time. The symbols are encoded into variable-length output codes. The length of each output code depends on the probability or frequency of the symbol: low-probability symbols are encoded using many bits, and high-probability symbols are encoded using fewer bits.
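As a non-limiting illustration of the statistical approach, the following Python sketch uses Huffman coding, one well-known statistical method, to derive a code length for each symbol from its frequency. The function name huffman_code_lengths is hypothetical, and only the code lengths are computed; assigning the actual bit patterns is omitted for brevity.

```python
import heapq
from collections import Counter


def huffman_code_lengths(text: str) -> dict[str, int]:
    """Return the Huffman code length, in bits, for each symbol in `text`."""
    counts = Counter(text)
    if not counts:
        return {}
    if len(counts) == 1:
        # A single distinct symbol still needs one bit per occurrence.
        return {symbol: 1 for symbol in counts}

    # Each heap entry is (weight, tie_breaker, {symbol: depth_in_subtree}).
    # The unique tie_breaker keeps the dictionaries from ever being compared.
    heap = [
        (weight, index, {symbol: 0})
        for index, (symbol, weight) in enumerate(counts.items())
    ]
    heapq.heapify(heap)
    tie_breaker = len(heap)

    while len(heap) > 1:
        # Merge the two least frequent subtrees; every symbol in them
        # moves one level deeper, i.e. gains one bit of code length.
        weight_a, _, depths_a = heapq.heappop(heap)
        weight_b, _, depths_b = heapq.heappop(heap)
        merged = {s: d + 1 for s, d in {**depths_a, **depths_b}.items()}
        heapq.heappush(heap, (weight_a + weight_b, tie_breaker, merged))
        tie_breaker += 1

    return heap[0][2]


if __name__ == "__main__":
    # 'a' is the most frequent symbol, so it receives the shortest code.
    print(huffman_code_lengths("aaaabbc"))
```

In this sketch the entire input is counted before a single code can be assigned, which illustrates why the statistical analysis pass adds latency ahead of any output.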
Both methods spend most of their time examining the source text and analyzing it to find patterns that can be represented in another, smaller format. For high-performance applications this approach does not work. To achieve the highest compression speed, no time can be spent on statistical analysis or dictionary creation. When data is sent over a network or stored on disk or in memory, fast compression is critical because compression pays off only when sending or storing the data takes longer than compressing it. This means there is almost no time for analysis of the source data, and yet there still needs to be a significant reduction in size.
Accordingly, it would be desirable to save time, rather than space, by reducing the amount of data transferred or stored, but without the high latency introduced by the conventional techniques. This would benefit network transfer, disk or memory storage, cache storage, or any other real-time application where time plays a crucial role.