With the advent of file-sharing applications such as KaZaA, Gnutella, BearShare, and Winny, the amount of peer-to-peer (P2P) traffic on the Internet has grown immensely in recent years. In fact, it has been estimated that P2P traffic now represents about 50-70 percent of the total traffic on the Internet, even though the number of P2P users is quite small compared to the number of non-P2P users. Thus, it appears that most of the bandwidth on the Internet is being consumed by a minority of users. For this and other reasons, P2P traffic is viewed by Internet service providers (ISPs) and others as abusive or misbehaving traffic that should be controlled and penalized.
Before P2P traffic can be controlled, however, it must first be identified. Earlier generations of P2P protocols used fixed TCP port numbers for their transmissions. For example, FastTrack used TCP port 1214. This made P2P traffic easy to identify. Current P2P protocols, however, no longer have to use fixed port numbers. Rather, they can be configured to use random, dynamically chosen port numbers, so that P2P traffic can masquerade as other types of traffic, such as HTTP web browsing or unspecified TCP traffic. As a result, current P2P protocols have rendered port-based identification techniques ineffective.
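The port-based approach described above, and the way it fails, can be illustrated with a minimal Python sketch. Only the FastTrack entry (TCP port 1214) comes from the text; the table name, the function name, and the HTTP entry for port 80 are assumptions made for illustration and do not describe any particular classifier.

```python
# Hypothetical port table. Only the FastTrack entry (TCP 1214) comes from the
# text; the HTTP entry is the conventional web port, added for illustration.
PORT_TABLE = {
    1214: "FastTrack (P2P)",
    80: "HTTP",
}


def classify_by_port(tcp_port: int) -> str:
    """Label a flow using nothing but its TCP port number."""
    return PORT_TABLE.get(tcp_port, "unspecified TCP")


# An older client on its fixed port is identified correctly.
print(classify_by_port(1214))

# A newer client configured to use port 80 is labeled as web browsing, and one
# using a random dynamic port is labeled as unspecified TCP traffic.
print(classify_by_port(80))
print(classify_by_port(49731))
```

Because the classifier sees only the port number, a P2P client that is free to choose its port determines which label it receives.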
Another technique that has been used to identify P2P traffic involves the use of signatures. Specifically, it was observed that some P2P protocols inserted distinctive information into their data packets. Using this information as a signature, it was possible to identify packets that were assembled using those P2P protocols. This technique has several problems. First, it is usually effective for only a relatively short period of time. As the P2P protocols evolve and mutate (which they do on a fairly constant basis), their signatures change. Once that happens, the previous signatures are no longer valid, and the technique has to be updated to recognize the new signatures. A second and more serious problem is that P2P protocols are now evolving to the point that they either leave no signature or obfuscate their signatures (e.g., by encryption). This makes it extremely difficult, if not impossible, to identify P2P traffic using signatures.
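Both weaknesses of signature matching can be seen in a small Python sketch. Everything here is hypothetical: the protocol name "ExampleP2P", its signature bytes, and the single-byte XOR used as a stand-in for encryption are invented for illustration and do not correspond to any real protocol.

```python
# Hypothetical signature table; "ExampleP2P" and its marker bytes are invented.
SIGNATURES = {"ExampleP2P": b"EXAMPLE-P2P/1.0"}


def classify_by_signature(payload: bytes) -> str:
    """Return the first protocol whose signature bytes occur in the payload."""
    for protocol, signature in SIGNATURES.items():
        if signature in payload:
            return protocol
    return "unknown"


def xor_obfuscate(data: bytes, key: int) -> bytes:
    """Toy stand-in for encryption: XOR every byte with a one-byte key."""
    return bytes(b ^ key for b in data)


original = b"EXAMPLE-P2P/1.0 GET chunk 42"
mutated = b"EXAMPLE-P2P/2.0 GET chunk 42"      # protocol revised, marker changed
obfuscated = xor_obfuscate(original, 0x5A)     # same packet, marker hidden

print(classify_by_signature(original))    # matched by the stored signature
print(classify_by_signature(mutated))     # missed until the table is updated
print(classify_by_signature(obfuscated))  # missed regardless of the table
```

The mutated packet can be caught again by adding a new table entry, whereas the obfuscated packet exposes no fixed byte pattern to add.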
Overall, P2P protocols have become quite sophisticated, and the more sophisticated they become, the more difficult it is to identify P2P traffic. Yet unless P2P traffic can be identified, it cannot be effectively controlled.