Computers on a network send information to each other as part of a communication session. The data for a communication session is broken up by the network and transferred from a source address to a destination address. This is analogous to the postal mail system, which uses zip codes, addresses, and known routes of travel to ship packages. If one were to ship the entire contents of a home to another location, it would not be cost effective or an efficient use of resources to package everything into one container for shipping. Instead, smaller containers would be used for the transportation and assembled after delivery. Computer networks work in a similar fashion by taking data and packaging it into smaller pieces for transmitting across a network. Each of these packets is governed by a set of rules that defines its structure and the service it provides. For example, the World Wide Web has a standard protocol defined for it, the Hypertext Transfer Protocol (HTTP). This standard protocol dictates how packets are constructed, how data is presented to web servers, and how those web servers return data to client web browsers.
Any application that transmits data over a computer network uses one or more protocols. There are many layers of protocols in use between computers on a network. Not only do web browsers have protocols they use to communicate, but the network has underlying protocols as well. The use of multiple nested protocols is sometimes referred to as “data encapsulation.” For example, when a request is made to a website, the data request is encapsulated by the HTTP protocol used by the browser. The data is then encapsulated by the computer's network stack before it is put onto the network. The network may encapsulate the packet into another packet using another protocol for transmission to another network. Each protocol layer helps provide the routing information needed to get the packets to their target destination.
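The layering described above can be sketched in a few lines of Python. This is an illustrative simplification: the bracketed "headers" below are invented stand-ins for real TCP, IP, and Ethernet wire formats, chosen only to make the nesting visible.

```python
# Illustrative sketch of data encapsulation: an application-layer HTTP
# request is wrapped in successively lower-layer headers before it is
# put onto the network. The header strings are simplified stand-ins,
# not real protocol wire formats.

def encapsulate(payload: bytes, header: bytes) -> bytes:
    """Prepend a protocol header to a payload (one layer of encapsulation)."""
    return header + payload

# Application layer: an HTTP GET request.
http_request = b"GET /index.html HTTP/1.1\r\nHost: example.com\r\n\r\n"

# Each lower layer wraps the packet handed down from the layer above.
tcp_segment = encapsulate(http_request, b"[TCP src=49152 dst=80]")
ip_packet   = encapsulate(tcp_segment,  b"[IP src=10.0.0.2 dst=10.0.0.1]")
frame       = encapsulate(ip_packet,    b"[ETH dst=aa:bb:cc:dd:ee:ff]")

# A receiving host strips each header in reverse order to recover the data.
assert frame.endswith(http_request)
```

Each layer treats everything it receives from the layer above as opaque payload, which is why a network can route a packet using only its own header without understanding the protocols nested inside.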
In order for a network administrator or other entity to analyze or monitor its users' traffic effectively, tools may be deployed to: “sniff” or capture the packets traversing the network of interest; understand the protocol(s) being used in the communication; analyze the data packets used in the communication; and draw conclusions based on information gained from this analysis. Conventional tools for analyzing network traffic include protocol analyzers, intrusion detection systems, application monitors, log consolidators, and combinations of these tools.
A conventional protocol analyzer can provide insight into the type of protocols being used on a network. The analysis tools within such an analyzer enable the analyzer to decode protocols and examine individual packets. By examining individual packets, conventional protocol analyzers can determine where the packet came from, where it is going, and the data that it is carrying. However, it is virtually impossible to look at every packet on a network by hand to see if security or other concerns exist. Consequently, more specialized analysis products were created.
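The per-packet decoding a protocol analyzer performs can be sketched against the fixed 20-byte IPv4 header, whose layout is standardized: unpacking it yields the source address, destination address, and the protocol carried inside. The sample packet below is hand-built for illustration.

```python
import socket
import struct

def parse_ipv4_header(packet: bytes) -> dict:
    """Decode the fixed 20-byte IPv4 header to find where a packet came
    from, where it is going, and which protocol it carries -- the kind
    of decoding a protocol analyzer performs on each captured packet."""
    (ver_ihl, _tos, total_len, _ident, _flags_frag,
     _ttl, proto, _checksum, src, dst) = struct.unpack("!BBHHHBBH4s4s", packet[:20])
    return {
        "version": ver_ihl >> 4,
        "header_len": (ver_ihl & 0x0F) * 4,   # IHL is in 32-bit words
        "total_len": total_len,
        "protocol": proto,                    # 6 = TCP, 17 = UDP
        "src": socket.inet_ntoa(src),
        "dst": socket.inet_ntoa(dst),
    }

# A hand-built sample header: IPv4, carrying TCP, 10.0.0.1 -> 10.0.0.2.
sample = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 40, 1, 0, 64, 6, 0,
                     socket.inet_aton("10.0.0.1"), socket.inet_aton("10.0.0.2"))
info = parse_ipv4_header(sample)
```

Decoding one packet this way is mechanical; the difficulty the paragraph above identifies is the volume, since no analyst can inspect every such record by hand.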
One example of a more specialized but conventional analysis tool is an Intrusion Detection System (IDS), which validates network packets against a series of known signatures. If the IDS determines that certain packets are invalid or suspicious, the IDS provides an alert. Analysts must then review most of these alerts, in some cases using additional analysis tools. This analysis can require extensive manpower and resources.
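Signature-based detection of the kind described above reduces, at its core, to matching packet payloads against known-bad patterns. The sketch below illustrates the idea; both signatures are invented examples, not entries from any real IDS ruleset.

```python
# Minimal sketch of signature-based intrusion detection: each packet
# payload is checked against known-bad byte patterns, and any match
# produces an alert. Both signatures below are invented examples.

SIGNATURES = {
    "shellcode-nop-sled": b"\x90\x90\x90\x90\x90\x90\x90\x90",
    "sql-injection":      b"' OR '1'='1",
}

def inspect(payload: bytes) -> list:
    """Return the names of all signatures found in a packet payload."""
    return [name for name, pattern in SIGNATURES.items() if pattern in payload]

alerts = inspect(b"GET /login?user=admin' OR '1'='1 HTTP/1.1")
```

Every match still lands in an analyst's queue, which is the manpower burden the paragraph above describes: the system flags suspicious packets but cannot itself decide what they mean.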
Another example of a more specialized but conventional analysis tool is an application monitor. Application monitors focus on specific application layer protocols to determine whether illegal or suspicious activity is being performed. Such conventional application monitors may focus, for example, on the Hypertext Transfer Protocol (HTTP) to monitor accesses to websites. As such, when a network user visits a website, the analyst can monitor the packets transmitted and received between the network user's computer and the web server. These packets can be analyzed by parsing the HTTP protocol to determine the website's hostname, the name of the file requested, and the associated content that was retrieved. Thus, this HTTP analyzer could be used to determine whether a network user is visiting inappropriate websites and alert appropriate personnel of this activity. This type of analysis tool monitors the actions of web browsers, but falls short for other types of communications.
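The HTTP parsing such a monitor performs can be sketched as follows: given a raw request captured off the wire, extract the requested file name from the request line and the site's hostname from the Host header. The captured request below is a made-up example.

```python
# Sketch of the HTTP parsing an application monitor performs: pull the
# requested file and the site's hostname out of a captured request.

def parse_http_request(raw: bytes) -> dict:
    """Extract the method, requested path, and Host header from a raw
    HTTP/1.1 request captured off the wire."""
    head = raw.split(b"\r\n\r\n", 1)[0].decode("ascii", errors="replace")
    request_line, *header_lines = head.split("\r\n")
    method, path, _version = request_line.split(" ")
    headers = dict(line.split(": ", 1) for line in header_lines if ": " in line)
    return {"method": method, "path": path, "host": headers.get("Host", "")}

# A made-up captured request.
captured = b"GET /reports/q3.pdf HTTP/1.1\r\nHost: intranet.example.com\r\n\r\n"
request = parse_http_request(captured)
```

The extracted hostname and path are exactly the fields an analyst would check against a list of inappropriate sites; a parser this specific to HTTP, however, sees nothing of any other protocol on the wire.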
Another conventional application monitor can monitor the Simple Mail Transfer Protocol (SMTP). This system can be used to, e.g., record and track e-mails sent outside of a company to ensure employees are not sending trade secrets or intellectual property owned by the company. It can also ensure e-mails entering a company do not contain malicious attachments or viruses. Employees could, however, use other means of communication such as instant messaging, chat rooms, and website-based e-mail systems to circumvent detection. Because this application monitor only monitors SMTP communications, companies must also use many other security and analytical tools to monitor network activity.
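The tracking described above rests on the SMTP envelope: the MAIL FROM and RCPT TO commands in a session identify the sender and recipients of each message. The sketch below walks a captured session transcript (invented for illustration) and records that envelope.

```python
# Sketch of SMTP session monitoring: walk the client commands in a
# captured session and record the envelope sender and recipients, so
# mail crossing the company boundary can be logged and reviewed.

def track_smtp(session_lines) -> dict:
    """Pull the envelope sender and recipients from captured SMTP commands."""
    record = {"from": None, "to": []}
    for line in session_lines:
        cmd = line.strip()
        if cmd.upper().startswith("MAIL FROM:"):
            record["from"] = cmd[len("MAIL FROM:"):].strip().strip("<>")
        elif cmd.upper().startswith("RCPT TO:"):
            record["to"].append(cmd[len("RCPT TO:"):].strip().strip("<>"))
    return record

# An invented captured session: one outbound message leaving the company.
session = [
    "HELO workstation.corp.example",
    "MAIL FROM:<alice@corp.example>",
    "RCPT TO:<partner@outside.example>",
    "DATA",
]
envelope = track_smtp(session)
```

A monitor this shape sees only SMTP: a file sent over instant messaging or webmail never produces a MAIL FROM command, which is exactly the blind spot the paragraph above notes.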
Another example of a more specialized but conventional analysis tool is a log consolidator system (LCS). An LCS processes log-based output from network applications or devices. These data inputs can include firewall logs, router logs, application logs such as web server or mail server logs, host system logs, and/or IDS alerts. LCS analysis tools aggregate and correlate each different log format. However, high-level log data often lacks the detail needed for effective analysis.
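Aggregation and correlation of this kind requires a per-format parser that maps each log line onto a common record shape. The sketch below normalizes two unrelated formats, a firewall log line and a web-server access log line, both invented for illustration, so that a single query can span them.

```python
import re

# Sketch of what a log consolidator does: parse two unrelated log
# formats into one common record shape so events from different
# devices can be aggregated and correlated. Both sample lines and
# both formats are invented for illustration.

def parse_firewall(line: str) -> dict:
    m = re.match(r"(\S+ \S+) DENY src=(\S+) dst=(\S+)", line)
    return {"time": m.group(1), "source": m.group(2),
            "target": m.group(3), "action": "deny", "origin": "firewall"}

def parse_webserver(line: str) -> dict:
    m = re.match(r'(\S+) - - \[([^\]]+)\] "GET (\S+)', line)
    return {"time": m.group(2), "source": m.group(1),
            "target": m.group(3), "action": "get", "origin": "webserver"}

events = [
    parse_firewall("2024-01-05 10:01:02 DENY src=10.0.0.9 dst=10.0.0.1"),
    parse_webserver('10.0.0.9 - - [05/Jan/2024:10:01:05] "GET /admin HTTP/1.1" 403 12'),
]
# Once normalized, one query correlates activity across both sources.
from_host = [e for e in events if e["source"] == "10.0.0.9"]
```

Note what is lost: the common records carry only the fields the log lines happened to contain, so the packet contents behind each event are unavailable, which is the detail gap identified above.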
While these and other conventional network analysis systems analyze communications of a particular protocol or format, they fail to analyze a broad range of protocols and formats, and fail to provide true context. Thus, an entity wishing to ensure security of its network currently must purchase and maintain multiple network analysis systems. Further, with each new protocol or protocol change, companies must create, rewrite, upgrade, or repurchase at least one of their systems. The conventional method of using a patchwork of multiple analyzers is expensive and complex to maintain.
In addition, because of the many ways to communicate over a network and the many different analysis tools needed to perform deep content and context analysis, the conventional network analysis methods make it difficult to answer even simple questions such as “What is happening on my network?,” “Who is talking to whom?,” and “What resources are being accessed?” Answering these questions is difficult because there is virtually no limit to which applications one can use. Each application introduced onto a network brings new protocols and new analytical tools to audit those applications. For example, there are many ways to send a file to another person using a network: e-mailing the document as an attachment using the SMTP protocol; transmitting the file using an Instant Messenger like MSN, AOL IM™, or Yahoo™ IM; uploading the file to a shared file server using the FTP protocol; web sharing the document using the HTTP protocol; or uploading the file directly using an intranet protocol like SMB or CIFS. All of these protocols are implemented differently, and special analysis tools are required to interpret each of them, making for a complex and expensive system.
In sum, the ever-increasing volume and variety of network traffic are forcing network administrators, Internet service providers (ISPs), corporate information technology (IT) managers, and law enforcement personnel, among others, to re-evaluate how best to effectively monitor the types and nature of data traffic that traverses the networks for which they have oversight.
One system that provides significant insight into network traffic is described in U.S. patent application Ser. No. 10/133,392, entitled “Apparatus and Method for Network Analysis,” and filed Apr. 29, 2002 (“the '392 application”). The methodology described in the '392 application includes capturing network traffic in its native format, and converting the same into a common language, or event-based language. Records of the data can then be organized around the common language. The disclosure of the '392 application is incorporated by reference herein in its entirety.
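The normalization idea described above can be hedged into a small sketch: traffic decoded from different native protocols is re-expressed in one common, event-based vocabulary. The event fields, the sample inputs, and the `to_event` function below are all invented for illustration and are not the '392 application's actual format or method.

```python
# Hedged sketch of normalizing native-format traffic into a common,
# event-based language. All field names and sample inputs are invented
# for illustration, not the '392 application's actual format.

def to_event(protocol: str, fields: dict) -> dict:
    """Map protocol-specific fields onto one common event record."""
    if protocol == "http":
        return {"who": fields["client"], "whom": fields["host"],
                "action": "retrieve", "object": fields["path"]}
    if protocol == "smtp":
        return {"who": fields["from"], "whom": fields["to"],
                "action": "send", "object": "email"}
    raise ValueError(f"no normalizer for {protocol}")

# Traffic captured in two different native formats, as parsed records.
events = [
    to_event("http", {"client": "10.0.0.9", "host": "example.com",
                      "path": "/a.pdf"}),
    to_event("smtp", {"from": "alice@corp.example",
                      "to": "bob@outside.example"}),
]
# "Who is talking to whom?" becomes one query over the common records.
pairs = [(e["who"], e["whom"]) for e in events]
```

The design point is that questions like “who is talking to whom” can then be asked once, over the common records, rather than once per protocol-specific tool.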
Notwithstanding the advances and advantages of the systems and methods described in the '392 application, there is nevertheless a need for improved systems and methodologies for monitoring network traffic.