A communication network typically includes different types of network nodes, such as user devices, routers, network address translators (NATs), media relay servers etc., which perform different functions within the network. Communication between two communicating nodes (endpoints, such as user devices) may be via other nodes of the network (intermediate nodes, such as routers, NATs and media relay servers). The network may have a layered architecture, whereby different logical layers provide different types of node-to-node communication services. Each layer is served by the layer immediately below that layer (other than the lowest layer) and provides services to the layer immediately above that layer (other than the highest layer). The network may be a packet-based network and/or an internet.
A media session may be established between two endpoints, such as user devices, connected via a communication network so that real-time media can be transmitted and received between those endpoints via the network. An example of a media session is a SIP (“Session Initiation Protocol”) media session. The media session may be a Voice or Video over IP (VOIP) session, in which audio and/or video of a call is transmitted and received between the endpoints in the VOIP session. Endpoints and other types of network node may be identified by a network address (e.g. IP (“Internet Protocol”) address), with the session being established between transport addresses associated with the endpoints. A transport address is a combination of a network address (e.g. IP address) and a port associated with that network address.
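By way of illustration only, a transport address as defined above (a network address combined with a port) may be sketched as a small value type. The class name `TransportAddress` and its fields are assumptions made for this example and are not taken from any particular protocol implementation; the addresses used come from ranges reserved for documentation.

```python
from dataclasses import dataclass
import ipaddress


@dataclass(frozen=True)
class TransportAddress:
    """Hypothetical value type: a network (IP) address plus a port."""
    ip: str
    port: int

    def __post_init__(self) -> None:
        # ip_address() raises ValueError if the string is not a valid IPv4/IPv6 address.
        ipaddress.ip_address(self.ip)
        if not 0 < self.port <= 65535:
            raise ValueError(f"port out of range: {self.port}")

    def __str__(self) -> str:
        addr = ipaddress.ip_address(self.ip)
        # IPv6 addresses are conventionally bracketed when followed by a port.
        if addr.version == 6:
            return f"[{addr}]:{self.port}"
        return f"{addr}:{self.port}"


# Two transport addresses on the same network address differ if their ports differ.
audio = TransportAddress("192.0.2.10", 5004)
video = TransportAddress("192.0.2.10", 5006)
```

In this sketch, equality is structural, so two transport addresses are the same only when both the network address and the port match.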
To establish the media session, one of the endpoints may transmit a media session request to the other endpoint. Herein, an endpoint that initiates a request for a media session (e.g. audio/video communications) is called an “initiating endpoint” or equivalently a “caller endpoint”. An endpoint that receives and processes the communication request from the caller is called a “responding endpoint” or “callee endpoint”. Each endpoint may have multiple associated transport addresses, e.g. a local transport address, a transport address on the public side of a NAT, a transport address allocated on a relay server etc. During media session establishment, for each endpoint, a respective address is selected for that endpoint to use to transmit and receive data in the media session. For example, the addresses may be selected in accordance with the ICE (“Interactive Connectivity Establishment”) protocol. Once the media session is established, media can flow between those selected addresses of the different endpoints. To select a path, a list of so-called “candidate pairs” is generated, each of which comprises a network address available to a first of the endpoints (a “local” candidate from the perspective of the first endpoint) and a network address available to the second endpoint (a “remote” candidate from the perspective of the first endpoint). Note that “local” in this context is not restricted to host addresses on the local interface of the first endpoint, and can also include reflexive addresses on the public side of the NAT, or a relay network address of a media relay server that can relay media data to the first endpoint. Every possible pairing of local and remote candidates may be checked to determine whether or not it is valid, by sending one or more probe messages from the local address to the remote address during so-called “connectivity checks”.
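The pairing and checking described above may be sketched as follows. This is a simplified illustration and not an implementation of ICE: the names `Candidate`, `candidate_pairs`, `valid_pairs` and `fake_probe` are hypothetical, the addresses are examples, and the probe is a stand-in for the exchange of real probe messages. Prioritisation and ordering of pairs are omitted.

```python
from dataclasses import dataclass
from itertools import product
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Candidate:
    """Hypothetical candidate: an address available to an endpoint."""
    kind: str  # "host", "reflexive" or "relay"
    ip: str
    port: int


Pair = Tuple[Candidate, Candidate]


def candidate_pairs(local: List[Candidate], remote: List[Candidate]) -> List[Pair]:
    # Every possible pairing of a local candidate with a remote candidate.
    return list(product(local, remote))


def valid_pairs(pairs: List[Pair], probe: Callable[[Candidate, Candidate], bool]) -> List[Pair]:
    # Keep only the pairs for which the connectivity check (the probe) succeeds.
    return [(l, r) for (l, r) in pairs if probe(l, r)]


def fake_probe(l: Candidate, r: Candidate) -> bool:
    # Assumption for illustration: a direct host-to-host probe fails because
    # the endpoints sit in different private networks; all others succeed.
    return not (l.kind == "host" and r.kind == "host")


local = [
    Candidate("host", "10.0.0.5", 50000),
    Candidate("reflexive", "203.0.113.7", 61000),
    Candidate("relay", "198.51.100.20", 3478),
]
remote = [
    Candidate("host", "192.168.1.9", 50002),
    Candidate("relay", "198.51.100.21", 3478),
]

pairs = candidate_pairs(local, remote)
valid = valid_pairs(pairs, fake_probe)
```

With three local and two remote candidates, six pairs are generated. Because the number of pairs is the product of the two candidate counts, the number of connectivity checks grows quickly as candidates are added.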