Communication networks are increasingly using packet-switching techniques for interconnecting systems on both local and wide area networks. A packet is the unit of data that is routed between an origin and a destination on a packet-switched network, such as the Internet. Enterprises have come to rely on applications, such as electronic mail, that use these packet-based communications. However, enterprise networks that are interconnected with widely deployed packet-switched networks, such as the public Internet, are vulnerable to security and availability threats.
The data traffic passing through a network can be monitored in order to detect or address network security and availability concerns. Flow collection is the process of monitoring network traffic from one or many network segments and grouping related individual packets into a logical relationship known as a flow. A flow can be defined as a communication session identified by a distinct source address/port, a distinct destination address/port, and/or a distinct protocol-specific property. A flow record distills the individual packets transmitted (and, in the case of a bi-directional flow record, received) as part of this communication session into one or more data records that describe the communication in a format that can be more easily analyzed for traffic analysis purposes or identification of security risks.
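By way of illustration only, the grouping of packets into flows described above can be sketched as follows. The sketch assumes a uni-directional flow keyed on source address/port, destination address/port, and protocol; the `Packet` fields, the `FlowRecord` class, and the `collect_flows` function are hypothetical names chosen for this example and do not describe any particular collector.

```python
from collections import namedtuple
from dataclasses import dataclass

# Hypothetical packet representation; the field names are assumptions.
Packet = namedtuple(
    "Packet", "src_addr src_port dst_addr dst_port protocol length"
)


@dataclass
class FlowRecord:
    """Summary of all packets observed for one flow key."""
    key: tuple
    packets: int = 0
    bytes: int = 0


def flow_key(pkt):
    # Assumed key: source address/port, destination address/port, protocol.
    return (pkt.src_addr, pkt.src_port, pkt.dst_addr, pkt.dst_port, pkt.protocol)


def collect_flows(packets):
    """Group packets into uni-directional flow records keyed by flow_key."""
    flows = {}
    for pkt in packets:
        key = flow_key(pkt)
        record = flows.setdefault(key, FlowRecord(key))
        record.packets += 1
        record.bytes += pkt.length
    return flows
```

In this sketch, two packets that share the same key are distilled into a single record with a packet count of two, while a packet with a different destination port produces a separate record.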
One conventional approach to network monitoring samples the packets that are visible on a network interface. Sampling means processing some packets and letting other packets pass the network interface or monitoring point unprocessed. While this approach reduces the number of packets that must be analyzed, enterprise policy violations (such as excessive peer-to-peer usage) may be overlooked. Furthermore, sampling may overlook important security concerns, such as Trojan or back-door programs that are listening on unknown ports.
Additionally, application analysis can be an important component of identifying security risks. For example, a secure shell (SSH) service running on an atypical port may represent a security concern. Conventional approaches to application identification use only packet header data, which does not necessarily result in an accurate identification of the communicating applications. These conventional approaches often rely on predetermined, static port assignments (such as a file transfer protocol (FTP) service listening on port 21) to identify the application associated with a flow record. However, packet header data and static assumptions about port assignments are insufficient to accurately identify the applications that are involved in the communication session.
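The limitation of port-based identification can be illustrated with a minimal sketch. It assumes a small static port table and a payload check based on the fact that an SSH identification string begins with "SSH-"; the table contents and the function names `identify_by_port` and `identify_by_payload` are hypothetical and are not drawn from any existing tool.

```python
# Assumed static port table, as a conventional approach might use.
WELL_KNOWN_PORTS = {21: "ftp", 22: "ssh", 80: "http"}

# A few HTTP request methods, used here as a simple payload signature.
HTTP_METHODS = (b"GET ", b"POST ", b"HEAD ", b"PUT ")


def identify_by_port(dst_port):
    """Guess the application from the destination port alone."""
    return WELL_KNOWN_PORTS.get(dst_port, "unknown")


def identify_by_payload(payload):
    """Guess the application from the first bytes of packet content."""
    if payload.startswith(b"SSH-"):
        return "ssh"
    if payload.startswith(HTTP_METHODS):
        return "http"
    return "unknown"
```

Under these assumptions, an SSH service listening on port 2222 is reported as "unknown" by `identify_by_port(2222)`, whereas `identify_by_payload(b"SSH-2.0-OpenSSH_8.9\r\n")` reports "ssh". A practical classifier would need many more signatures than this sketch contains.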
Other conventional network monitoring approaches also lack the capability to capture packet content (i.e., payload data) from the flow during the creation of flow data records. Typical network monitoring tools can be configured to capture or store packet content only after a security event has been triggered. A problem with this approach, however, is that important details about the root cause of the security event may be lost because these details were communicated at the beginning of the flow before the event was triggered.
What is needed is a flow collector that generates flow data records based on each packet that is observed at one or more network monitoring points. What is further needed is a flow data record that includes a configurable amount of packet content that can be used for analysis, such as application identification or security forensics.
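As a hedged illustration of a flow data record that retains a configurable amount of packet content, the following sketch keeps only the first `max_payload_bytes` bytes of payload observed from the beginning of a flow. The `PayloadFlowRecord` class and its fields are hypothetical names assumed for this example, and the sketch does not represent a definitive implementation.

```python
from dataclasses import dataclass, field


@dataclass
class PayloadFlowRecord:
    """Flow record that retains a bounded prefix of the flow's payload."""
    max_payload_bytes: int  # configurable capture limit (assumed parameter)
    packets: int = 0
    payload: bytearray = field(default_factory=bytearray)

    def add_packet(self, payload):
        """Count every packet; keep content only until the limit is reached."""
        self.packets += 1
        remaining = self.max_payload_bytes - len(self.payload)
        if remaining > 0:
            self.payload.extend(payload[:remaining])
```

Because content is captured from the first packet onward rather than after an event is triggered, details communicated at the beginning of the flow remain available for later analysis, while the limit bounds the storage consumed per flow.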