1. Field of the Invention
This invention relates to compression of electronic data. More specifically, it relates to recursive-capable lossless compression of highly random electronic data.
2. Description of the Prior Art
Compression of electronic data is essential to normal functionality of modern information technologies. As the amount of digital data and the sizes of computer files increase at an unprecedented rate, the need for efficient compression of electronic data is more pressing than ever. Current compression technologies, however, are limited in their capabilities.
Most compression algorithms focus on removing redundancy (repeated or similar data fragments) from the original data. Numerous compression technologies based on this scheme are widely known and used in the art. They include the use of dictionaries, tables of previously encountered data, and partial matches of previous data. Hashes of the previous data have been used to improve the efficacy of compression technologies, and multithreaded engines have been used to decrease the time required to compress datasets. However, to date, little has been done to approach compression of electronic data that is highly random and has little to no redundancy (in information-theoretic terms, data whose entropy is already high), such as JPEG2000 and NITF files.
Technologies for compression of electronic data may be divided into two broad categories: lossy and lossless. Lossy compression inevitably introduces a degree of degradation to the original data, meaning that an exact replica of the original data cannot be recovered after the original data is compressed. Although lossy compression can be quite beneficial for some uses, degradation of the original data is a major drawback that renders lossy compression unsuitable for many applications.
Lossless compression, on the other hand, permits the original electronic data to be compressed and decompressed without any degradation. A number of lossless compression algorithms are known in the art. Most rely on run-length encoding, or a modified version thereof, to exploit redundancy in the electronic data. During compression, runs of data bytes having the same value are substituted with a smaller number of key bytes, thus reducing the size of the data. The key bytes contain both the value of the byte and the number of bytes in the run. During decompression, the key bytes dictate the number of times the data byte value must be duplicated to obtain the original data from the compressed data. A related example of a lossless compression method that exploits redundancy is the Lempel-Ziv-Welch method patented in U.S. Pat. No. 4,558,302, which substitutes dictionary codes for previously encountered sequences rather than encoding runs of a single value.
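The run-length substitution described above can be illustrated with a minimal Python sketch. The function names `rle_encode` and `rle_decode`, the two-byte key layout (run count followed by byte value), and the 255-byte cap on a run are assumptions chosen for illustration only; they are not taken from any method cited in this description.

```python
def rle_encode(data: bytes) -> bytes:
    """Replace each run of identical bytes with two key bytes: (count, value).

    Assumption: the count is stored in a single byte, so runs longer
    than 255 bytes are split into several key-byte pairs.
    """
    out = bytearray()
    i = 0
    while i < len(data):
        value = data[i]
        run = 1
        # Extend the run while the next byte matches and the cap is not hit.
        while i + run < len(data) and data[i + run] == value and run < 255:
            run += 1
        out += bytes((run, value))
        i += run
    return bytes(out)


def rle_decode(encoded: bytes) -> bytes:
    """Expand each (count, value) pair back into `count` copies of `value`."""
    out = bytearray()
    for k in range(0, len(encoded), 2):
        run, value = encoded[k], encoded[k + 1]
        out += bytes([value]) * run
    return bytes(out)


if __name__ == "__main__":
    original = b"AAAABBBCCD"
    packed = rle_encode(original)
    print(list(packed))                    # key bytes as (count, value) pairs
    print(rle_decode(packed) == original)  # lossless round trip
```

Note that under this layout an input with no repeated neighbors, such as `b"ABCD"`, grows to twice its size, which is one way to see why such schemes depend on redundancy in the input.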
The lossless compression methods known in the art have some significant limitations. Although they can be quite effective for compressing data with a high amount of redundancy, such as text files and word-processor documents, they are ineffective and often counterproductive for compression of highly random data, such as JPEG2000 and NITF files. Moreover, even for electronic data with high redundancy, once that redundancy has been removed by lossless compression, currently available methods cannot compress such data any further.
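These limitations can be observed with a small experiment. The sketch below uses DEFLATE as exposed by Python's standard `zlib` module as a stand-in for a conventional lossless method; the helper name `compressed_sizes`, the sample inputs, and the number of passes are assumptions made for illustration. It records the size of the data after each successive compression pass.

```python
import os
import zlib
from typing import List


def compressed_sizes(data: bytes, passes: int = 3) -> List[int]:
    """Return [original size, size after pass 1, size after pass 2, ...].

    Each pass feeds the previous pass's output back into zlib.compress,
    mimicking an attempt at repeated compression with a conventional method.
    """
    sizes = [len(data)]
    for _ in range(passes):
        data = zlib.compress(data, 9)  # level 9 = maximum compression effort
        sizes.append(len(data))
    return sizes


# Highly repetitive, text-like input (illustrative sample, not real-world data).
text_like = b"the quick brown fox jumps over the lazy dog. " * 200
# Highly random input of the same length, standing in for already-compressed files.
random_like = os.urandom(len(text_like))

if __name__ == "__main__":
    print("text-like:  ", compressed_sizes(text_like))
    print("random-like:", compressed_sizes(random_like))
```

Under these assumptions, the repetitive input is expected to shrink substantially on the first pass and to gain little or nothing on later passes, while the random input is expected not to shrink at all and to grow slightly because of format overhead.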
Accordingly, what is needed is a method of lossless compression of electronic data capable of recursive compression.