The Internet is a wide area data communication network formed from a plurality of interconnected data networks. In operation, the Internet facilitates data communication between a range of remotely situated data processing systems. Typically, end user data processing systems connected to the Internet are referred to as client data processing systems or clients. Similarly, data processing systems hosting web sites and services for access by end users via the Internet are referred to as server data processing systems or servers. There is a client-server relationship completed via the Internet between the end user data processing systems and the hosting data processing systems.
The Internet has become an important communication network for facilitating electronically effected commercial interactions between consumers, retailers, and service providers. Access to the Internet is typically provided to such entities via an Internet Service Provider (ISP). Each ISP typically operates an open network to which clients subscribe. Each client is provided with a unique Internet Protocol (IP) address on the network. Similarly, each server on the network is provided with a unique IP address. The network operated by the ISP is connected to the Internet via a dedicated data processing system usually referred to as a router. In operation, the router directs inbound communication traffic from the Internet to specified IP addresses on the network. Similarly, the router directs outbound communication traffic from the network in the direction of specified IP addresses on the Internet.
The term peer-to-peer applies to the broad category of applications and protocols where clients, or peers, in a network establish communication sessions directly with each other. In contrast, client-server applications assume that clients communicate only with known servers. A characteristic of peer-to-peer applications is therefore the symmetry of roles. While in a client-server architecture it is only the client who can initiate communication to the server, in a peer-to-peer architecture, communication can be initiated by any peer.
File-sharing applications such as KaZaA and BitTorrent are examples of peer-to-peer applications. They include mechanisms for finding resources and for peer address resolution, which are used before the actual file transfer takes place. Note that these applications do rely on a server for some functions, e.g. initialization, but they eventually use peer-to-peer communication for both resource finding and data transfer. Other types of peer-to-peer applications include interactive games and video-conferencing, e.g. using the Real-time Transport Protocol (RTP).
The asymmetry of roles in client-server applications, which are the majority of Internet applications, has had many implications for the way networks are architected. For example, a firewall used to secure an intranet is typically more permissive when communication is initiated internally, i.e. within the intranet, assuming that it is a client trying to contact an external server. Equally, a firewall used to secure an intranet typically blocks all traffic initiated externally, unless it is directed to a server for which an exception exists. Network Address Translation (NAT) mechanisms are used to share an external IP address among clients within an intranet. NAT mechanisms may follow a similar principle to the firewalls, namely, communication has to be initiated from within the intranet so that a mapping between an internal client, having an internal address, and an external client, having an external address, is established. As a consequence of such an architecture, peer-to-peer communication across intranet/Internet boundaries is not straightforward. Peer-to-peer communication that is initiated by an external client may be blocked by firewalls and NATs, although it is possible to configure specific rules and port mappings to allow such peer-to-peer communication.
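By way of illustration only, the behavior described above can be sketched as a toy model in Python. The class name `Nat`, its method names, the starting port number, and the example addresses are all assumptions made for this sketch and do not describe any particular product. The model combines address translation with firewall-like filtering: an inbound packet is accepted only if an internal client has previously sent a packet to the same external endpoint.

```python
class Nat:
    """Toy NAT with firewall-like filtering (hypothetical model, not a real device)."""

    def __init__(self, public_ip, first_port=40000):
        self.public_ip = public_ip
        self.next_port = first_port
        self.out_map = {}     # internal (ip, port) -> public port
        self.in_map = {}      # public port -> internal (ip, port)
        self.allowed = set()  # (public port, external endpoint) pairs opened from inside

    def outbound(self, src, dst):
        """An internal client sends to dst: create or reuse a mapping and open a return path."""
        if src not in self.out_map:
            self.out_map[src] = self.next_port
            self.in_map[self.next_port] = src
            self.next_port += 1
        port = self.out_map[src]
        self.allowed.add((port, dst))
        return (self.public_ip, port)

    def inbound(self, src, public_port):
        """An external endpoint sends to public_port: deliver only if a return path exists."""
        if (public_port, src) not in self.allowed:
            return None  # blocked: the communication was not initiated internally
        return self.in_map[public_port]


nat = Nat("203.0.113.5")
client = ("10.0.0.2", 5000)
server = ("198.51.100.7", 80)

blocked = nat.inbound(server, 40000)      # nothing has been sent from inside yet
mapped = nat.outbound(client, server)     # the client initiates the communication
delivered = nat.inbound(server, 40000)    # the reply from the same server now passes
```

In this sketch an unsolicited packet from any other external endpoint remains blocked even after the mapping exists, which is the reason peer-to-peer communication initiated externally does not succeed without further measures.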
Recent widespread availability of broadband Internet connections for home users has led to increased residential usage of middleboxes. Middleboxes are usually a combination of firewall and NAT functionality. Where a peer is behind a middlebox, its private address is mapped to a public address. Peer-to-peer software developers have devised techniques for establishing data communication channels which traverse middleboxes without requiring manual configuration of the middlebox by the user. Generally, in order to establish such data communication channels, the first step is to determine a transport address to use for the peer-to-peer communication, and the second step is to traverse the middlebox.
It is known in the art to perform the first step by relying on an initial connection to a well-known server. Peers can find out about each other by, for example, using identifiers such as aliases in an instant messaging server. Then the two peers wanting to establish a peer-to-peer communication connect to the well-known server and transmit their current transport addresses. The server's reply contains the other peer's address information. Note that the address information in this exchange generally includes both private and public addresses. Private addresses are included so that peer-to-peer communication can be established where both peers are in the same address space, since some NATs do not provide “loopback translation”.
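The address exchange of the first step can be sketched as follows. This is a minimal in-memory illustration under stated assumptions: the names `RendezvousServer`, `PeerAddresses`, `register` and `lookup`, the aliases, and the addresses are hypothetical, and a real server would learn the public address from the source address of the packet it receives rather than from a function argument.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PeerAddresses:
    private: tuple  # (ip, port) reported by the peer itself
    public: tuple   # (ip, port) as observed by the server, i.e. after NAT


class RendezvousServer:
    """Hypothetical well-known server that stores peer addresses keyed by alias."""

    def __init__(self):
        self.registry = {}

    def register(self, alias, private_endpoint, observed_source):
        # observed_source stands in for the source address seen on the wire.
        self.registry[alias] = PeerAddresses(private_endpoint, observed_source)

    def lookup(self, alias):
        # The reply carries both addresses so that peers in the same
        # address space can fall back to the private one.
        return self.registry.get(alias)


server = RendezvousServer()
server.register("alice", ("10.0.0.2", 5000), ("203.0.113.5", 40000))
server.register("bob", ("192.168.1.7", 6000), ("198.51.100.9", 41000))

bob_info = server.lookup("bob")  # what the first peer receives about the second
```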
Generally, the second step is performed using techniques such as User Datagram Protocol (UDP) hole punching, or others described in “Peer-to-Peer communication across Network Address Translators” by B. Ford, P. Srisuresh and D. Kegel, USENIX Annual Technical Conference, Apr. 10-15, 2005. The UDP hole punching technique works in the following way. A first peer and a second peer “simultaneously” send each other a session initiation UDP packet, and each continues sending it until a response packet is received. This is done using both the private and the public address. If the two peers are located in the same address space, the packet sent to the private address is accepted immediately, a response is sent, and the peers can start communicating following the protocol specific to the application. If the first peer is located behind a middlebox, then the packet sent by the second peer to the first peer's private address is meaningless (and probably not even routed). However, because the first peer sent a packet with its destination address set to the second peer's public address, the middlebox assumes that incoming packets from the second peer are part of a peer-to-peer communication initiated internally. The packet sent by the second peer to the first peer's public address is therefore accepted. Session initiation packets are retransmitted so as to account for cases where the endpoint is not yet open when the first packet arrives. The same process occurs for the second peer.
The above-described UDP hole punching technique only works with NATs that reuse port bindings for different invocations, and with NATs/firewalls that open a UDP endpoint for receiving packets from an external peer when the same endpoint has been used for sending packets to that peer.
It is noted that the UDP hole punching technique is not always successful, due to the variety of middlebox behaviors, although in many cases it does allow peering sessions to be established even when both peers are behind a middlebox.
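The role of retransmission in UDP hole punching can be illustrated with a small simulation. It is a sketch under simplifying assumptions rather than an implementation of any real protocol stack: the names `Middlebox` and `punch` are hypothetical, each middlebox reuses a single port binding (the favorable case noted above), each peer already knows the other's public address, and within a round the first peer's packet arrives before the second peer sends.

```python
class Middlebox:
    """Toy middlebox that reuses one port binding and filters by remote endpoint."""

    def __init__(self, public_ip, port=40000):
        self.public = (public_ip, port)  # single, reused public endpoint (assumption)
        self.inside = None               # private endpoint bound to the public one
        self.open_to = set()             # remote endpoints allowed to send inbound

    def send(self, private_src, dst):
        """Outbound packet: bind the private endpoint and open a path for dst."""
        self.inside = private_src
        self.open_to.add(dst)
        return self.public               # source address seen by the outside world

    def receive(self, src):
        """Inbound packet: deliver only if the inside already sent to src."""
        return self.inside if src in self.open_to else None


def punch(box_a, priv_a, box_b, priv_b, max_rounds=5):
    """Retransmit session initiation packets until both directions get through."""
    log = []
    for round_no in range(1, max_rounds + 1):
        # The first peer sends to the second peer's public address.
        a_to_b = box_b.receive(box_a.send(priv_a, box_b.public)) is not None
        # The second peer sends to the first peer's public address.
        b_to_a = box_a.receive(box_b.send(priv_b, box_a.public)) is not None
        log.append((round_no, a_to_b, b_to_a))
        if a_to_b and b_to_a:
            break
    return log


box_a = Middlebox("203.0.113.5")
box_b = Middlebox("198.51.100.9")
log = punch(box_a, ("10.0.0.2", 5000), box_b, ("192.168.1.7", 6000))
# log == [(1, False, True), (2, True, True)]
```

In the first round the first peer's packet is dropped because the second middlebox has not yet seen any outbound traffic towards the first peer; the retransmission in the second round is accepted in both directions.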
Improper usage of peer-to-peer applications in enterprise networks is a problem with several negative consequences. Firstly, these applications introduce new traffic patterns which can impact the performance and availability of the enterprise's networking resources. Secondly, and more significantly, they represent a legal risk and a potential security exposure. The former is due to possible copyright infringement claims associated with file-sharing; the latter is due to the danger of confidential information being inadvertently disclosed by misconfigured or malicious peer-to-peer software, and to the possible introduction of viruses contained in shared files.
Known approaches to tackling this problem are based on analyzing traffic statistics at the network boundaries, and generally rely on the usage of well-known peer-to-peer UDP and Transmission Control Protocol (TCP) ports. However, due to the dynamic nature of peer-to-peer networks and the increasingly sophisticated techniques used to traverse controls such as firewalls, these approaches do not accurately detect and control peer-to-peer software.
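The limitation of port-based detection can be seen in a short sketch. The function name `flag_by_port` and the flow record layout are hypothetical, and the port list is illustrative only: it uses commonly cited default ports for KaZaA (1214) and BitTorrent (6881 to 6889) and is not taken from any particular detection product.

```python
# Illustrative port list only (assumption): commonly cited defaults for
# KaZaA (1214) and BitTorrent (6881-6889).
KNOWN_P2P_PORTS = frozenset({1214, *range(6881, 6890)})


def flag_by_port(flow):
    """Return True if either endpoint of the flow uses a listed port."""
    return (flow["src_port"] in KNOWN_P2P_PORTS
            or flow["dst_port"] in KNOWN_P2P_PORTS)


flows = [
    {"proto": "tcp", "src_port": 51000, "dst_port": 6881},   # default port: flagged
    {"proto": "udp", "src_port": 40000, "dst_port": 40001},  # negotiated ports: missed
    {"proto": "tcp", "src_port": 52311, "dst_port": 80},     # ordinary web port: not flagged
]
flags = [flag_by_port(f) for f in flows]
```

A peering session that negotiates arbitrary ports, as in the second flow, is indistinguishable from other traffic under this kind of check.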
It is an aim of the present invention to provide a system for detecting and controlling peer-to-peer communication across intranet/Internet boundaries. It is a further aim to provide local reporting of local problems. In addition, the detection can be realized transparently to the internal and external peers.