1. Field of the Invention
The present invention relates to computers and computer networks. More particularly, the invention relates to classifying network traffic in a computer network.
2. Background of the Related Art
Identifying the flows generated by different application-layer protocols is of major interest for network operators. For Internet service providers (ISPs), identifying traffic allows them to differentiate the QoS (quality of service) for different types of applications, such as voice applications and video applications. Moreover, it enables them to control high-bandwidth and non-interactive applications, such as peer-to-peer (P2P) applications. For enterprise networks, it is very important for administrators to know what activities take place on their network, such as the services that users are running, the application dominating network traffic, etc. Traffic classification is also important for securing the network. In fact, even traditional protocols are often used as a means to control attacks, such as the use of IRC (Internet Relay Chat) to manage the C&C (command and control) nodes for botnets. Overall, traffic classification is the first step in building any kind of intelligence on a network.
Despite significant research efforts toward solving the network traffic classification problem, many deployed solutions rely heavily on payload and deep packet inspection (DPI) techniques. Payload-based techniques fail to classify encrypted traffic, and require constant maintenance and updating of signatures, which is an expensive and time-consuming process. In addition, it is often desirable to classify traffic that does not contain any payload and is summarized in the form of flow records or packet headers. At the same time, many applications, such as peer-to-peer (P2P) applications, often randomize their ports, thus rendering port-based classification unreliable.
Throughout this disclosure, the term “flow” refers to a sequence of packets from a source node to a destination node in the network. Generally, a flow is represented by a 5-tuple of <source IP address, destination IP address, source port, destination port, protocol>. In particular, the protocol in the 5-tuple refers to a layer 4 (i.e., transport layer) protocol, such as TCP, UDP, ICMP, etc. Further, the terms “application” and/or “application class” refer to a layer 7 (i.e., application-layer) protocol with a distinct documented behavior in terms of communication exchanges, control packets, etc. Examples of such applications include HTTP, SMTP, MSN, BitTorrent, Gnutella, POP3, eDonkey, Telnet, Samba, Yahoo IM, etc. Moreover, the term “application” may be referred to as the label or the class of the flow depending on the context.
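By way of illustration only, the 5-tuple representation of a flow described above may be sketched in Python as follows. The names FlowKey, group_into_flows, and the dictionary-based packet records are hypothetical and are assumed here solely for the example; they are not part of the disclosure, and the addresses, ports, and labels are made-up sample values.

```python
from collections import defaultdict
from typing import NamedTuple


class FlowKey(NamedTuple):
    """Hypothetical 5-tuple identifying a flow (names are illustrative)."""
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    protocol: str  # layer 4 (transport layer) protocol, e.g. "TCP" or "UDP"


def group_into_flows(packets):
    """Group packet records that share the same 5-tuple into one flow.

    Each packet is assumed to be a dict carrying the five header fields;
    arrival order within a flow is preserved.
    """
    flows = defaultdict(list)
    for pkt in packets:
        key = FlowKey(pkt["src_ip"], pkt["dst_ip"],
                      pkt["src_port"], pkt["dst_port"], pkt["protocol"])
        flows[key].append(pkt)
    return dict(flows)


# Made-up sample packets: two belong to one TCP flow, one to a UDP flow.
packets = [
    {"src_ip": "10.0.0.1", "dst_ip": "192.0.2.7", "src_port": 51000,
     "dst_port": 80, "protocol": "TCP", "size": 120},
    {"src_ip": "10.0.0.1", "dst_ip": "192.0.2.7", "src_port": 51000,
     "dst_port": 80, "protocol": "TCP", "size": 1500},
    {"src_ip": "10.0.0.2", "dst_ip": "192.0.2.9", "src_port": 40000,
     "dst_port": 53, "protocol": "UDP", "size": 64},
]

flows = group_into_flows(packets)

# The "application" (layer 7 protocol) acts as the label or class of a flow.
# The label below is assigned by hand for illustration, not inferred.
http_key = FlowKey("10.0.0.1", "192.0.2.7", 51000, 80, "TCP")
labels = {http_key: "HTTP"}
```

In this sketch the flow is directional, consistent with the definition of a sequence of packets from a source node to a destination node; packets travelling in the reverse direction would therefore fall under a separate key.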