The past few years have seen a dramatic increase in peer-to-peer (P2P) applications running over the Internet and enterprise IP networks. These include P2P content distribution applications such as BitTorrent, BitComet and eDonkey, and P2P streaming applications such as PPLive, PPStream and SopCast. Together, these applications account for a large share of total network traffic.
Network operators of both the Internet and enterprise networks need to identify the various P2P applications and their associated traffic in order to support network operations and management, traffic engineering, capacity planning, provisioning and cost reduction. For instance, by rate-limiting or blocking P2P traffic, an enterprise can protect the performance of critical applications, and broadband ISPs would like to limit P2P traffic to reduce the charges paid to upstream ISPs.
There are several existing approaches to identifying P2P traffic. Port-based identification was effective in earlier days, when most P2P applications used fixed, default transport-layer port numbers. Today, however, a substantial share of P2P traffic is carried over a large number of non-standard ports, which makes identification by default port ineffective.
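The limitation of port-based identification can be illustrated with a minimal sketch. The port table below is an assumption for illustration only (it lists a few commonly cited default ports, not a complete or authoritative set), and the function and variable names are hypothetical.

```python
# Illustrative table of commonly cited default ports (assumption, not exhaustive).
DEFAULT_P2P_PORTS = {
    "BitTorrent": set(range(6881, 6890)),  # 6881-6889
    "eDonkey": {4662, 4672},
}


def classify_by_port(src_port, dst_port):
    """Return the application whose default port matches either endpoint, else None."""
    for app, ports in DEFAULT_P2P_PORTS.items():
        if src_port in ports or dst_port in ports:
            return app
    return None


# Three hypothetical flows given as (source port, destination port).
# The last one stands for a P2P client configured with a non-standard port.
flows = [(51234, 6881), (4662, 40000), (51500, 51413)]
labels = [classify_by_port(s, d) for s, d in flows]
```

In this sketch the third flow is left unlabeled because neither port appears in the table, which is the failure mode described above.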
Signature-based identification aims to identify P2P applications reliably by inspecting packet payloads for application-specific signatures. However, hardware resource limitations, payload encryption by applications, privacy and legal issues, and similar practical problems make packet payloads difficult to obtain.
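As a minimal sketch of signature matching, the example below checks a payload for the plaintext BitTorrent handshake prefix (a length byte of 19 followed by the string "BitTorrent protocol"). The function name and the sample payloads are hypothetical, and a practical classifier would need many more signatures than this single one.

```python
# Plaintext BitTorrent handshake prefix: length byte 19 followed by the protocol string.
BT_HANDSHAKE = b"\x13BitTorrent protocol"


def match_signature(payload: bytes):
    """Return "BitTorrent" if the payload starts with the handshake prefix, else None."""
    if payload.startswith(BT_HANDSHAKE):
        return "BitTorrent"
    return None


# Hypothetical payloads: a plaintext handshake (prefix, 8 reserved bytes,
# 20-byte info hash, 20-byte peer id) and a stand-in for an encrypted payload.
plain = BT_HANDSHAKE + bytes(8) + bytes(20) + bytes(20)
obscured = bytes(range(1, 69))

plain_label = match_signature(plain)
obscured_label = match_signature(obscured)
missing_label = match_signature(b"")  # payload not captured at all
```

The second and third cases return no label, which corresponds to the situations where the payload is encrypted or cannot be obtained.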
Because of their peer-to-peer nature, P2P applications are known to behave differently from traditional applications such as DNS, e-mail and the Web. Beyond these differences from traditional applications, P2P content distribution and P2P streaming applications also differ from each other in their characteristic behaviors.
P2P applications exhibit two kinds of periodic behavior. The first relates to peer selection or peer changing and is common to both P2P content distribution and P2P streaming applications. In P2P content distribution applications, peers periodically perform choking and optimistic unchoking in order to keep effective neighbors. In P2P streaming applications, peers also apply peer selection algorithms, but not choking and optimistic unchoking; instead, they select peers based on the neighbors' responses to the buffer information they sent out earlier. The second kind is that a peer running a P2P streaming application periodically sends its streaming buffer information (Buffer Map) to a large number of its neighbor peers, which causes a sudden increase in the number of concurrent connections between the peer and different remote hosts within a short period.
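The second behavior suggests one way such a pattern might be observed from flow records alone. The following is only a sketch under stated assumptions, not a method taken from the text: flow records are assumed to be (timestamp in seconds, remote host) pairs for a single local peer, and the function names, the window size, the burst factor and the tolerance are all hypothetical choices.

```python
from collections import defaultdict
from statistics import median


def distinct_remotes_per_window(flows, window):
    """Count distinct remote hosts contacted in each time window.

    flows: iterable of (timestamp_seconds, remote_host) for one local peer.
    Returns a dict mapping window index to the number of distinct remotes.
    """
    remotes = defaultdict(set)
    for ts, host in flows:
        remotes[int(ts // window)].add(host)
    return {w: len(hosts) for w, hosts in remotes.items()}


def burst_windows(counts, factor=3.0):
    """Return sorted window indices whose count exceeds factor times the median."""
    if not counts:
        return []
    base = median(counts.values())
    return sorted(w for w, c in counts.items() if c > factor * base)


def looks_periodic(bursts, tolerance=1):
    """True if there are at least three bursts with nearly equal spacing."""
    if len(bursts) < 3:
        return False
    gaps = [b - a for a, b in zip(bursts, bursts[1:])]
    return max(gaps) - min(gaps) <= tolerance


# Synthetic trace: 2 background remotes every second, plus a burst of
# 20 extra remotes every 10 seconds (at t = 5, 15, 25, ...).
flows = []
for t in range(60):
    flows += [(t, f"bg{i}") for i in range(2)]
    if t % 10 == 5:
        flows += [(t, f"peer{i}") for i in range(20)]

counts = distinct_remotes_per_window(flows, window=1)
bursts = burst_windows(counts)
periodic = looks_periodic(bursts)
```

On this synthetic trace the bursts fall in evenly spaced windows, so the spacing check succeeds; real traffic would be noisier and would likely call for a more robust periodicity test.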
Recently, a novel approach called BLINC has been proposed to identify Internet applications using only flow-level information of the kind produced by current flow collectors (Cisco NetFlow, etc.). BLINC shifts the focus from classifying each individual flow to associating Internet hosts with applications; its novelty lies in identifying hosts by capturing the fundamental patterns of their behavior at the transport layer. However, BLINC can identify only the type of an application (WEB, DNS, FTP, ATTACK or P2P); it cannot tell which specific software is in use (P2P content distribution such as BitTorrent, BitComet, etc., or P2P streaming such as PPLive, PPStream, etc.).