A communication network may for example be a packet-based network and/or an internet. A network typically includes different types of network nodes, such as user devices, routers, network address translators (NATs), proxy servers, media relay servers etc., which perform different functions within the network. For instance, routers route packets between individual networks of an internet. NATs also perform such routing, as well as performing network address translation i.e. to mask the network address of the sender. Communication between two communicating nodes, such as user devices, may be via other nodes of the network, i.e. intermediate nodes such as routers, NATs and media relay servers. Every active network interface (e.g. of a user device, server etc.) connected to the network is assigned a network address, e.g. IP (Internet Protocol) address, so that is data can be routed thereto via the network. This may for example be assigned by an ISP (Internet Service Provider) in the case of a public network, or other network administrator.
A media session may be established between two endpoints, such as user devices, connected via a communication network so that real-time media can be transmitted and received between those endpoints via the network. The endpoints run client software to enable the media session to be established. The media session may be a Voice or Video over IP (VOIP) session, in which audio and/or video data of a call is transmitted and received between the endpoints in the VOIP session as media streams. Endpoints and other types of network node may be identified by a network address, such as an IP address. A transport address is formed of an IP address and a port number identifying a port associated with the IP address. A media session being may be established between transport addresses associated with the endpoints. An example of a media session is a SIP (“Session Initiation Protocol”) media session. SIP signalling, e.g. to establish or terminate a call or other communication event, may be via one or more SIP (proxy) server(s). To this end, the SIP proxy forwards SIP requests (e.g. “INVITE”, “ACK”, “BYE”) and SIP responses (e.g. “100 TRYING”, “180 RINGING”, “200 OK”) between endpoints. In contrast to a media relay server, the media (audio/video) data itself does not flow via a basic SIP proxy i.e. the proxy handles only signalling, though it may in some cases be possible to combine proxy and media relay functionality in some cases. To establish the media session, one of the endpoints may transmit a media session request to the other endpoint. Herein, an endpoint that initiates a request for a media session (e.g. audio/video communications) is called an “initiating endpoint” or equivalently a “caller endpoint”. An endpoint that receives and processes the communication request from the caller is called a “responding endpoint” or “callee endpoint”. Each endpoint may have multiple associated transport addresses e.g. a local transport address, a transport address on the public side of a NAT, a transport address allocated on a relay server etc. During media session establishment, for each endpoint, a respective address may be selected for that endpoint to use to transmit and receive data in the media session. For example, the addresses may be selected in accordance with the ICE (“Interactive Connectivity Establishment”) protocol. Once the media session is established, media can flow between those selected addresses of the different endpoints.
A known type of media relay server is a TURN (Traversal Using Relays around NAT) server, e.g. a TURN/STUN (Session Traversal Utilities for NAT) incorporating both TURN and STUN functionality. The network may have a layered architecture, whereby different logical layers provide different types of node-to-node communication services. Each layer is served by the layer immediately below that layer (other than the lowest layer) and provides services to the layer immediately above that layer (other than the highest layer). A media relay server is distinguished from lower-layer components such as routers and NATS in that it operates at the highest layer (application layer) of the network layers. The application layer provides process-to-process connectivity. For example, the TURN protocol may be implemented at the application layer to handle (e.g. generate, receive and/or process) TURN messages, each formed of a TURN header and a TURN payload containing e.g. media data for outputting to a user. The TURN messages are passed down to a transport layer below the network layer. At the transport layer, one or more transport layer protocols such as UDP (User Datagram Protocol), TCP (Transmission Control Protocol) are implemented to packetize a set of received TURN message(s) into one or more transport layer packets, each having a separate transport layer (e.g. TCP/UDP) header that is attached at the transport layer. The transport layer provides host-to-host (end-to-end) connectivity. Transport layer packets are, in turn are passed to an internet layer (network layer) below the transport layer. At the internet layer, an internet layer protocol such as IP is implemented to further packetize a set of received transport layer packet(s) into one or more internet layer (e.g. IP) packets, each having a separate network layer (e.g. IP) header that is attached at the internet layer. The internet layer provides packet routing between adjacent networks. Internet layer packets are, in turn, passed down to the lowest layer (link layer) for framing and transmission via the network. In the reverse direction, data received from the network is passed up to the IP layer, at which network layer (e.g. IP) headers are removed and the remaining network layer payload data, which constitutes one or more transport layer packets including transport layer header(s), is passed up to the transport layer. At the transport layer, transport layer (e.g. UDP/TCP) headers are removed, and the remaining payload data, which constitutes one or more TURN messages in this example, is passed up to the application layer for final processing, e.g. to output any media data contained in them to a user, or for the purposes of relaying the TUN message(s) onwards. This type of message flow is implemented at both endpoints and TURN servers i.e. endpoints and TURN servers operates at the application layer in this manner.
An IP address uniquely identifies a network interface of a network node within a network, e.g. within a public network such as the Internet or within a private network. There may be multiple application layer processes running in that node, and a transport address (IP address+port number) uniquely identifies an application layer process running on that node. That is, each process is assigned its own unique port. The port (or equivalently “socket”) is a software entity to which messages for that process can be written so that they become available to that process. An IP address is used for routing at the internet layer by internet layer protocols (e.g. IP) and constitutes an internet layer network address that is included in the headers of internet layer packets, whereas the port number is used at the transport layer by transport layer protocols e.g. TCP/UDP to ensure that received data is passed to the correct application layer process. A transport layer packet includes a port number in the header, which identifies the process for which that packet is destined. In accordance with the TCP/IP Protocol Suite, a port number has a length of 16 bits. This means a maximum of 216−1=65535 ports can be associated with a single IP address (as port 0 is reserved).
In contrast to media relay servers, routers typically only operate at the internet layer, routing IP packets based on IP addresses in IP packet headers. Notionally, NATs also only operate at the network layer and are distinguished from basic routers in that NATs modify IP headers during routing to mask the IP address of the source. However, increasingly NATs perform modifications at the transport layer, i.e. to transport layer packet headers, so at to also mask the source port number e.g. to provide one-to-many network address translation.