1. Field of the Invention
The present invention relates to computers and computer networks. More particularly, the invention relates to compressing data in a network.
2. Background of the Related Art
Recent years have witnessed a sudden increase in demand from Internet Service Providers (ISPs) for content-rich traffic information to enable novel IP applications such as real-time marketing, traffic management, security, and lawful intercept (i.e., Internet surveillance). Typically, such applications are hosted at a centralized processing center to which the traffic data is transferred for further processing. Thus, raw traffic and/or meta-data events exported from the monitoring stations to the central processing center compete heavily for network bandwidth with the applications used by commercial ISPs' customers. At the same time, carriers have clearly expressed a strong desire not just to collect and analyze traffic, but also to store the exported information in order to analyze trends of application usage and user behavior over time. This information is useful for understanding the popularity of a specific IP application or service over time, trends in the security vulnerabilities being exploited, etc. More recently, government agencies have asked carriers to store specific data in their facilities for many years before it is discarded. One example is a data retention requirement under which layer-4 information and key packet payload information must be stored for all of a carrier's customers. All of the above translates into huge storage requirements for carriers; for example, the TCP/IP headers collected in an hour on a 10 Gb/s link can easily require 3 Terabytes of storage capacity.
Despite advances in data compression techniques (e.g., Gzip, Bzip, Pzip, Fascicle, ItCompress, Spartan, TCP/IP header compression techniques, etc.) developed for database applications and network traffic applications, there remains a need for techniques that achieve a high compression ratio for lossless real-time compression of network traffic data and an even higher compression ratio for network archive data. It would be desirable for such a technique to use the same algorithm for both online compression of real-time traffic data and offline compression of archive data, to analyze the internal structure of network data to improve the real-time compression ratio, to determine the compression plan based on an offline training procedure, and to apply the compression plan to both the header and the payload of the network data packets.