Data compression is the process of transforming data so that it occupies less space, hence “compressed”, thereby consuming less space on storage media. The compressed data can then be stored and transmitted more efficiently over various communication channels using modern telecommunication standards. Compression can be either lossy or lossless, as determined in advance. Decompression is the symmetrical process necessary to restore the compressed data to its original state.
Many techniques have been used over the years to compress digital data. However, they are all based on the same few basic principles: statistical coding, dictionary coding, or decorrelation (Storer J. A., Data Compression: Method and Theory, Computer Science Press (1993); Williams R. N., Adaptive Data Compression, Kluwer Academic Publishers (1990); Salomon D., Data Compression, Springer (1997)).
A major example of statistical coding is Huffman encoding (see Salomon D., Data Compression, Springer (1997)). In this method, it is assumed that certain bytes occur more frequently in the file than others. In the general case of a binary file produced by a random source, the frequency distribution can be close to uniform, and Huffman compression will fail.
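The principle can be illustrated with a minimal sketch (illustrative only, not taken from any cited work): a Huffman tree is built by repeatedly merging the two least-frequent subtrees, so a skewed byte distribution yields short codes for frequent symbols, while a uniform distribution yields no gain.

```python
import heapq
from collections import Counter

def huffman_code_lengths(freqs):
    """Return {symbol: code length} for a Huffman code built from freqs."""
    # Each heap entry: (total weight, tie-breaker, {symbol: depth so far}).
    heap = [(w, i, {s: 0}) for i, (s, w) in enumerate(freqs.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        w1, _, a = heapq.heappop(heap)   # pop the two least-frequent subtrees
        w2, _, b = heapq.heappop(heap)
        merged = {s: d + 1 for s, d in {**a, **b}.items()}  # merge: depth + 1
        heapq.heappush(heap, (w1 + w2, tie, merged))
        tie += 1
    return heap[0][2]

# Skewed source: 'a' dominates and therefore gets a 1-bit code.
skewed = Counter("aaaaaaaabbbc")
lengths = huffman_code_lengths(skewed)
compressed_bits = sum(lengths[s] * n for s, n in skewed.items())
# 12 bytes = 96 bits uncompressed; this Huffman code needs only 16 bits.
```

On a near-uniform distribution all code lengths come out equal, and the coded size matches the input size: exactly the failure mode described above.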
The dictionary algorithms are variations of the Lempel-Ziv technique of maintaining a “sliding window” of the most recently processed bytes of data and scanning the window for sequences of matching bytes. The input character stream is compared character by character with character sequences stored in a dictionary to check for matches. One example of such a scheme is described in U.S. Pat. No. 6,075,470 (Little), entitled ‘Block-wise Adaptive Statistical Data Compression’, issued on Jun. 13, 2000, which uses adaptive statistical block coding.
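The window scan can be sketched as follows (a minimal LZ77-style search, illustrative only and not any patented scheme): the longest prefix of the look-ahead buffer that occurs byte-for-byte in the window is reported as an (offset, length) pair.

```python
def longest_match(window: bytes, lookahead: bytes):
    """Find the longest prefix of `lookahead` occurring in `window`.
    Returns (offset, length); (0, 0) means no match was found."""
    best_off, best_len = 0, 0
    for i in range(len(window)):
        length = 0
        # The match must be exact, byte for byte, to count at all.
        while (length < len(lookahead) and i + length < len(window)
               and window[i + length] == lookahead[length]):
            length += 1
        if length > best_len:
            best_off, best_len = i, length
    return best_off, best_len

# "abra" recurs in the window, so it can be replaced by (offset=0, length=4).
off, ln = longest_match(b"abracad", b"abra!")
```

A real compressor would then emit the (offset, length) pair instead of the matched bytes; literal searching like this is quadratic, so production implementations use hash chains or suffix structures instead.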
Lempel-Ziv-Welch (LZW) is a universal lossless data compression algorithm created by Abraham Lempel, Jacob Ziv, and Terry Welch. It was published by Welch in 1984 as an improved implementation of the LZ78 algorithm published by Lempel and Ziv in 1978. The algorithm is simple to implement and has the potential for very high throughput in hardware implementations. It is the algorithm used by the widely deployed Unix file compression utility compress, and it is used in the GIF image format.
These dictionary algorithms require: a) a number of repetitions of a sequence before it is included in the dictionary; b) inclusion of the dictionary sequences in the output, so that the matching rate must be high enough to actually achieve compression; and c) an exact match between sequences in the input window and the dictionary. For example, the letters ‘b’ and ‘c’ do not match, so compression fails, even though in binary coding the difference is only one bit.
The decorrelation technique is applied to highly correlated data, such as space or medical images, using wavelets or the Fast Fourier Transform as a set of basis functions for expanding an input image. These transformations are described in detail in Rao K. R., Yip P. C., Eds., The Transform and Data Compression Handbook, CRC Press (2001). If the input sequence is highly correlated, the coefficients of the transformation decay rapidly, and the sequence of coefficients can be cut off, providing compression with some loss of information. These losses may be acceptable for human perception of an image, but they are unacceptable for compression of text or executable files, which are not correlated and for which no losses are acceptable. They are also unacceptable for correlated diagnostic or intelligence images, in which the high-frequency component can have an important informative value.
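The coefficient-decay behavior can be demonstrated with one level of the Haar wavelet transform (a minimal stand-in for the transforms cited above, illustrative only):

```python
def haar_step(x):
    """One level of the (unnormalized) Haar wavelet transform:
    pairwise averages (low-pass band) and half-differences (detail band)."""
    avgs = [(a + b) / 2 for a, b in zip(x[::2], x[1::2])]
    diffs = [(a - b) / 2 for a, b in zip(x[::2], x[1::2])]
    return avgs, diffs

# A smooth ramp is highly correlated: every detail coefficient collapses
# to the same tiny value and could be quantized or dropped (the lossy step).
smooth = [float(i) for i in range(16)]
avgs, diffs = haar_step(smooth)
# diffs == [-0.5] * 8: the detail band carries almost no information here.
```

On uncorrelated input (random bytes, text, executables) the detail coefficients stay as large as the data itself, so nothing can be cut off without unacceptable loss.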
One example of the decorrelation technique is described in U.S. Pat. No. 6,141,445 (Castelli et al.), entitled ‘Multiresolution Lossless/Lossy Compression and Storage of Data for Efficient Processing Thereof,’ which uses a lossy technique to produce lossless compression by applying an orthogonal expansion (which could be a wavelet expansion) to an input sequence, performing an inverse transform, and finding the residuals between the input data and the inverse-transformed data. The sequence of residuals can then be compressed using statistical techniques. That patent applies this approach to the general case of random binary data, disregarding the fact that such data may not be correlated. However, the approach is not efficient in that case: the sequence of coefficients of the orthogonal transformation does not decay, and it cannot be cut off.
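The lossy-plus-residual mechanism can be sketched as follows (a simplified stand-in for the patent’s orthogonal expansion, using only the low-pass half of a Haar step as the lossy stage; the numbers are invented for illustration):

```python
def lossy_reconstruct(x):
    """Crude lossy stage: keep only the pairwise integer averages
    (low-pass band), then reconstruct by repeating each average twice."""
    avgs = [(a + b) // 2 for a, b in zip(x[::2], x[1::2])]
    return [v for v in avgs for _ in range(2)]

data = [10, 12, 13, 15, 14, 16, 100, 7]   # mostly smooth, one spike
approx = lossy_reconstruct(data)          # what the lossy stage delivers
resid = [d - a for d, a in zip(data, approx)]  # correction terms

# Lossless overall: the input is recovered exactly as approx + resid.
restored = [a + r for a, r in zip(approx, resid)]
```

For correlated data the residuals stay small and compress well statistically; for uncorrelated data they remain as large as the input itself, which is the inefficiency noted above.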
The data compression process removes redundancy from the data, and this procedure can be related to the process of data encryption. A random number generator is a software program or hardware circuit that uses a recursive mathematical expression, or shift operations in a register, to produce a stream of random numbers. In the prior art, a random number generator is used only to encrypt the data, not to improve compression. See, for example, U.S. Pat. No. 6,122,379 (Barbir), entitled ‘Method and Apparatus for Performing Simultaneous Data Compression and Encryption’.
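The register-based generator and its encryption use can be sketched as follows (a toy linear-feedback shift register and XOR stream cipher, for illustration only; the tap positions are a standard textbook choice, not taken from the cited patent):

```python
def lfsr16(seed, n):
    """16-bit Fibonacci LFSR (taps at bits 16, 14, 13, 11): 'shift
    operations in a register' producing a pseudorandom bit stream."""
    state = seed & 0xFFFF
    out = []
    for _ in range(n):
        bit = ((state >> 0) ^ (state >> 2) ^ (state >> 3) ^ (state >> 5)) & 1
        state = (state >> 1) | (bit << 15)   # shift, feed the bit back in
        out.append(state & 1)
    return out

def xor_stream(data, seed):
    """Toy stream cipher: XOR each data byte with 8 keystream bits.
    The same call decrypts, since XOR is its own inverse."""
    bits = lfsr16(seed, 8 * len(data))
    out = bytearray(data)
    for i in range(len(out)):
        key_byte = 0
        for j in range(8):
            key_byte = (key_byte << 1) | bits[8 * i + j]
        out[i] ^= key_byte
    return bytes(out)

msg = b"secret"
enc = xor_stream(msg, 0xACE1)   # encrypt; xor_stream(enc, 0xACE1) decrypts
```

Note that XOR with a keystream changes the bytes but not the amount of redundancy, which is why, as stated above, such generators encrypt rather than compress.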
Accordingly, a need exists for systems, media, and methods for more efficient data compression. A simple, fast, and practical implementation is found in the description below. Specifically, the shortcomings of the above-cited resources are overcome and additional advantages are provided through the provision of a method for processing data, including cases in which the prior-art techniques are impossible to use.