Computer systems communicate with other computer systems using defined protocols through a data communications network. For example, a significant portion of data transmitted over the Internet is web traffic sent to or received from web servers using the Hypertext Transfer Protocol (HTTP).
While such data is transmitted between computer systems, other devices may capture this traffic for later inspection. Network traffic data is commonly used by corporations, businesses, governmental agencies, internet service providers, and other organizations to analyze and inspect various communications between computer systems. The uses of the captured data may include fraud prevention, behavior analysis, security analysis, website optimization, etc.
As the volume of network traffic grows, a tremendous cost and burden is placed on those who have come to rely on network traffic data. For example, the amount of storage needed for captured network traffic, and the cost of that storage, grow along with the amount of data constituting the traffic. Further, as the amount of traffic in a given time period grows, and hence the number of bytes stored grows, the time required to process, search, and analyze this data also increases. Some organizations have attempted to account for the ever-increasing traffic by storing smaller amounts of it (e.g., network traffic from shorter time periods). However, storing limited amounts of network traffic may reduce their ability to perform useful analysis over time, and thus reduces the usefulness of the system, which can then draw only on small snapshots of data. For instance, an organization that limits its storage of network traffic may be unable to determine how communications change over time. Conversely, the amount of storage demanded for retaining network traffic can make search and retrieval prohibitive.
Therefore, it is desirable to provide new techniques to solve these challenges.