1. Field of the Invention
The present invention is generally related to computer networks. More particularly, the present invention is related to systems and methods for compressing packet data.
2. Related Art
Presently, data compression is useful in many applications. One example is in storing data. The more data is compressed, the more information can be stored on a given storage device. Another example is in transferring data across a communication network. As bandwidth in communication networks is generally viewed as a limited resource, reducing the size of units of data being sent across the communication network may increase performance of the communication network.
One class of data compression is known as lossless data compression. Lossless data compression allows exact copies of original data to be reconstructed from compressed data. Lossless data compression is used, for example, in the popular ZIP file format and in the Unix tool gzip. Additionally, some image file formats, such as PNG or GIF, use lossless data compression.
A popular technique for lossless data compression is known as LZ77, which was published in 1977 by Abraham Lempel and Jacob Ziv. LZ77 is a substitutional compression algorithm, which operates by identifying repeated patterns in an original version of a data file (or other unit of data) to be compressed, removing the repeated patterns, and inserting pointers to previous occurrences of the repeated patterns in the data file. The pointers may each include a pair of numbers called a ‘length-distance pair,’ which may sometimes be referred to as a ‘length-offset pair.’ The length may specify a length of a repeated pattern being removed, whereas the distance or offset may be indicative of a separation between a previous occurrence of the repeated pattern and the subsequent occurrence of the repeated pattern being removed. The length and distance may be expressed in various units, such as bytes or characters. The resulting compressed data file may be significantly smaller than the original version of the data file. Nevertheless, the compressed data file can be decompressed such that the resulting data file is an exact copy of the original version of the data file.
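The substitution described above may be illustrated by the following minimal sketch. It is not a production LZ77 encoder; the function names, the token representation (a literal byte value or a (length, distance) tuple), and the window and match-length limits are assumptions chosen for illustration only.

```python
def lz77_compress(data: bytes, window: int = 4096,
                  min_len: int = 3, max_len: int = 258) -> list:
    """Encode data as literal byte values and (length, distance) pairs.

    Brute-force search, for illustration only; the parameters are assumed.
    """
    tokens = []
    i = 0
    while i < len(data):
        best_len, best_dist = 0, 0
        # Search the preceding window for the longest match starting at i.
        for j in range(max(0, i - window), i):
            length = 0
            while (length < max_len and i + length < len(data)
                   and data[j + length] == data[i + length]):
                length += 1
            if length > best_len:
                best_len, best_dist = length, i - j
        if best_len >= min_len:
            # Replace the repeated pattern with a pointer to its earlier copy.
            tokens.append((best_len, best_dist))
            i += best_len
        else:
            # No useful match: emit the byte itself as a literal.
            tokens.append(data[i])
            i += 1
    return tokens


def lz77_decompress(tokens: list) -> bytes:
    """Rebuild the original bytes from literals and (length, distance) pairs."""
    out = bytearray()
    for tok in tokens:
        if isinstance(tok, tuple):
            length, dist = tok
            # Copy byte by byte so that a match may overlap its own output.
            for _ in range(length):
                out.append(out[-dist])
        else:
            out.append(tok)
    return bytes(out)
```

For the input b"abcabcabcabc", this sketch emits three literals followed by the single pair (9, 3), meaning "copy 9 bytes starting 3 bytes back," and decompression restores the input exactly.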
A degree of compression may be expressed as a ratio of a size in bytes of the original version of the data file to a size in bytes of the compressed data file. A factor that affects the degree of compression attainable in substitutional compression methods, such as LZ77, is repetitiveness of the data to be compressed. In other words, more repetitive data can be compressed to a greater degree relative to less repetitive data because there are more occurrences of repeated patterns. Larger data files tend to contain more repeated patterns than smaller data files, simply because they offer more preceding data in which a pattern may recur. Thus, larger data files can generally be compressed to a greater degree relative to smaller data files using existing methods.
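The ratio described above may be computed as in the following sketch, which uses Python's standard zlib module (a DEFLATE implementation that builds on LZ77-style matching) as a stand-in compressor. The sample text and the helper name compression_ratio are assumptions for illustration, and repeating one sentence is a deliberately artificial way to make the larger input repetitive.

```python
import zlib


def compression_ratio(original: bytes, compressed: bytes) -> float:
    """Ratio of original size to compressed size, both measured in bytes."""
    return len(original) / len(compressed)


# Hypothetical sample data: one sentence versus the same sentence repeated.
sentence = b"the quick brown fox jumps over the lazy dog. "
small = sentence
large = sentence * 200

small_ratio = compression_ratio(small, zlib.compress(small))
large_ratio = compression_ratio(large, zlib.compress(large))

print(f"small input: {len(small)} bytes, ratio {small_ratio:.2f}")
print(f"large input: {len(large)} bytes, ratio {large_ratio:.2f}")
```

The larger, more repetitive input yields the higher ratio, while the small input gains little or nothing once the fixed overhead of the compressed format is counted.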
Commonly, data that is transferred across communication networks is divided into packets, also known as datagrams. A packet may be described as a unit of information transmitted as a whole from one device to another via a communication network. In packet-switched networks, for example, a packet may be described as a transmission unit of fixed maximum size that consists of binary digits representing both data and a header. The header may contain an identification number, source and destination addresses, and error-control data. To illustrate, a file may be sent by a sending device on one side of a communication network to a receiving device on another side of the communication network. Prior to or concurrently with sending, the file may be divided into packets. Subsequently, the packets may be received and reassembled by the receiving device to obtain the file.
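The division and reassembly described above may be sketched as follows. The Packet fields, the function names, the 1024-byte default payload limit, and the use of a CRC-32 checksum as the error-control data are all assumptions for illustration and do not correspond to any particular network protocol.

```python
import zlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    """Hypothetical packet: a small header plus a payload."""
    seq: int        # identification number, used to restore order
    src: str        # source address
    dst: str        # destination address
    crc: int        # error-control data (CRC-32 of the payload)
    payload: bytes  # data carried by this packet


def packetize(data: bytes, src: str, dst: str,
              max_payload: int = 1024) -> list:
    """Divide data into packets whose payloads are at most max_payload bytes."""
    chunks = [data[i:i + max_payload]
              for i in range(0, len(data), max_payload)]
    return [Packet(seq, src, dst, zlib.crc32(chunk), chunk)
            for seq, chunk in enumerate(chunks)]


def reassemble(packets: list) -> bytes:
    """Reorder packets by identification number and join their payloads."""
    out = bytearray()
    for packet in sorted(packets, key=lambda p: p.seq):
        if zlib.crc32(packet.payload) != packet.crc:
            raise ValueError(f"packet {packet.seq} failed its checksum")
        out += packet.payload
    return bytes(out)
```

Because each packet carries its own identification number, the receiving device in this sketch can restore the file even when the packets arrive in a different order than they were sent.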
Lossless data compression methods exist for compressing data from individual packets, such as the IP Payload Compression Protocol (IPComp) defined in RFC 3173. Since packets may be dropped or received out of order, these methods compress each packet independently of any other packet being sent. IPComp, for instance, compresses a given packet based on repetitive data included in that given packet. In other words, pointers of a compressed version of the given packet only point within the given packet. Because packets typically include a relatively small amount of data, the degree to which the packets can be compressed using IPComp and other existing methods may be limited as explained above.
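The limitation described above may be illustrated with the following sketch, which compresses each packet payload independently and compares the total with compressing the same data as a single unit. Python's zlib module is used only as a stand-in compressor; this is not an implementation of IPComp. The 512-byte packet size and the synthetic data (one pseudo-random block repeated sixteen times, so that repetition exists across packets but not within any one packet) are assumptions chosen to make the effect visible.

```python
import random
import zlib

PACKET_SIZE = 512  # assumed payload size, for illustration only

# Synthetic data: one pseudo-random block repeated, so every packet
# duplicates the others but contains little repetition of its own.
rng = random.Random(0)
block = bytes(rng.getrandbits(8) for _ in range(PACKET_SIZE))
stream = block * 16

packets = [stream[i:i + PACKET_SIZE]
           for i in range(0, len(stream), PACKET_SIZE)]

# Per-packet compression: pointers can only refer to data in the same packet.
compressed_packets = [zlib.compress(p) for p in packets]
per_packet_total = sum(len(c) for c in compressed_packets)

# Whole-stream compression: pointers can refer back to earlier packets' data.
whole_total = len(zlib.compress(stream))

# Independence: each packet decompresses alone, in any arrival order.
order = list(range(len(packets)))
rng.shuffle(order)
recovered = {i: zlib.decompress(compressed_packets[i]) for i in order}

print(f"original:            {len(stream)} bytes")
print(f"per-packet total:    {per_packet_total} bytes")
print(f"whole-stream total:  {whole_total} bytes")
```

In this sketch the per-packet total stays close to the original size, whereas the whole-stream result is far smaller. The trade-off is that the independently compressed packets tolerate loss and reordering, which whole-stream compression does not.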