SIP is a standardised signalling protocol for controlling communication sessions over a network. It enables sessions with one or more participants to be created, modified and/or terminated and it is widely used to control multimedia communications over Internet Protocol (IP) networks. For example, it can be used to control Voice over IP (VoIP) communications, instant messaging, chat, games and video communications between multiple entities, supporting both unicast and multicast sessions. It is also used as the signalling protocol in 3rd Generation Partnership Project (3GPP) standards such as the IP Multimedia Subsystem (IMS) architecture.
SIP is an application layer protocol and as such is designed to be used independently of underlying transport protocols. Its syntax is text-based and is modelled on that of HyperText Transfer Protocol (HTTP). SIP services are provided by SIP servers. A SIP server is typically a computing device configured to receive and process SIP messages. SIP messages comprise either a request or a response. A SIP transaction comprises a request that is sent to a SIP server and that invokes a particular method or function on the server. This method or function then results in a response which is sent by the SIP server in reply to the request. SIP servers may comprise, amongst others, logical end-points, known as “user agents”, proxy servers, redirect servers, registration servers and/or gateway devices such as session border controllers. User agents create and/or receive SIP messages. User agents may be software-based, for example a so-called “softphone” operating on a personal computer, or may comprise an embedded system forming part of a hardware device such as an IP phone. Proxy servers are typically used to route SIP requests. Redirect and registration servers support user mobility. Session border controllers control signalling between two networks. SIP servers may be coupled to a Plain Old Telephone Service (POTS) using a media gateway.
An example of a SIP transaction will now be described with reference to FIG. 1. FIG. 1 shows an exemplary SIP infrastructure 100. FIG. 1 shows two user agents: a softphone implemented on a laptop computer 110 and a SIP phone 150. FIG. 1 also shows three SIP servers: a first SIP proxy server 120, a SIP registration server 130 and a second SIP proxy server 140. The elements of FIG. 1 may be connected via one or more networks such as the Internet.
FIG. 1 also shows an exemplary procedure for placing a voice “call” using SIP messages. Assume softphone 110 and SIP phone 150 are operated by two users, A and B, respectively. Both A and B have corresponding SIP Uniform Resource Identifiers (URIs). These URIs are used to make a call and may resemble an email address with a username and a host name. A URI may be mapped to a physical device using a registration procedure. In FIG. 1, B sends a REGISTER request 200 to SIP registration server 130, the request identifying B's URI and an address of SIP phone 150. In response to the REGISTER request, the SIP registration server 130 associates B's URI with the address of the SIP phone 150, such that the second SIP proxy server 140 knows to locate B at SIP phone 150. The address may be an IP address. The SIP registration server 130 responds with an acknowledgement 201.
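The registration procedure described above can be illustrated with a minimal sketch. It assumes a purely in-memory binding table; the class name `Registrar`, its methods, the example URI and the example address are hypothetical and are not taken from any particular SIP implementation.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Registrar:
    """Toy stand-in for a SIP registration server (hypothetical, in-memory only)."""

    # Maps a user's SIP URI to the address of the device where the user can be reached.
    bindings: Dict[str, str] = field(default_factory=dict)

    def register(self, uri: str, contact: str) -> str:
        """Handle a REGISTER request: associate the URI with the device address."""
        self.bindings[uri] = contact
        return "200 OK"  # acknowledgement returned to the registering device

    def locate(self, uri: str) -> Optional[str]:
        """Look up where a user is currently registered, if anywhere."""
        return self.bindings.get(uri)


registrar = Registrar()
# User B registers a SIP phone; the URI and address are illustrative values only.
ack = registrar.register("sip:b@example.com", "192.0.2.50:5060")
# A proxy server may later query the registrar to locate user B.
location = registrar.locate("sip:b@example.com")
```

A practical registration server would typically also authenticate the request and expire bindings after a period of time; both are omitted here for brevity.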
To place a call, A sends an INVITE request 205 with B's URI to the first SIP proxy server 120. The first SIP proxy server 120 resolves the host name in B's URI and forwards an INVITE request 206 to the second SIP proxy server 140, which may be located based on the host name in B's URI. On receipt of the INVITE request 206, the second SIP proxy server 140 may make use of SIP registration server 130 to locate user B, as shown by messages 207 and 208. Further location services may also be used to locate B. The SIP registration server 130 may inform the second SIP proxy server 140 that user B is located at SIP phone 150, enabling it to forward the INVITE request 209 to SIP phone 150. Each SIP proxy server 120, 140 also sends a “100 Trying” response (not shown) in reply to each INVITE request and a “180 Ringing” response may be sent to A via the two SIP proxy servers 120 and 140 indicating that B's SIP phone 150 is ringing. If B answers SIP phone 150, a “200 OK” response is sent to A via the two SIP proxy servers 120 and 140, enabling a media session 220 to be set up between the softphone 110 and SIP phone 150. As SIP is a signalling protocol, media sessions are carried using an appropriate media transfer protocol, typically Real-time Transport Protocol (RTP) over User Datagram Protocol (UDP) or Transmission Control Protocol (TCP).
In the example of FIG. 1, communications 200 to 212 comprise SIP messages. SIP messages may comprise headers and values, all specified as strings. The body of a SIP message may contain a description of a session, for example using a format specified by the Session Description Protocol (SDP). Encryption may also be used, which is referred to as SIPS.
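By way of illustration, the text-based structure of a SIP message may be sketched as follows. This is a simplified sketch only: the function name `parse_sip_message` and the example message are hypothetical, and repeated headers, folded header lines and compact header names, all of which occur in practice, are not handled.

```python
def parse_sip_message(raw: str) -> dict:
    """Split a raw SIP message into its start line, headers and body (simplified)."""
    # A blank line separates the header section from the optional body.
    head, _, body = raw.partition("\r\n\r\n")
    lines = head.split("\r\n")
    start_line = lines[0]
    headers = {}
    for line in lines[1:]:
        # Split on the first colon only; header values may themselves contain colons.
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    # A response starts with the protocol version; a request starts with a method name.
    is_response = start_line.startswith("SIP/2.0")
    return {
        "start_line": start_line,
        "is_response": is_response,
        "headers": headers,
        "body": body,
    }


# Illustrative INVITE request carrying a (truncated) SDP body.
raw_invite = (
    "INVITE sip:b@example.com SIP/2.0\r\n"
    "To: <sip:b@example.com>\r\n"
    "From: <sip:a@example.org>;tag=1\r\n"
    "Call-ID: abc123\r\n"
    "CSeq: 1 INVITE\r\n"
    "Content-Type: application/sdp\r\n"
    "\r\n"
    "v=0\r\n"
)
message = parse_sip_message(raw_invite)
```

Even this reduced parsing involves string handling for every received message, which is part of the processing cost a SIP server incurs before a message can be authenticated.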
SIP servers are often exposed to large numbers of SIP messages. For example, a typical call volume may be millions of calls per hour, where each call typically requires multiple SIP messages. In certain circumstances, a SIP server may be exposed to higher than average SIP message volumes. For example, a popular live television programme may provide a time-limited telephone voting service; this may result in a sudden increase in call volumes, a so-called “mass calling” event. In this case, the increased traffic is legitimate, i.e. it does not relate to a malicious attack, and so a SIP server must allocate its limited resources such that as few users as possible experience disruptions in service.
Alternatively, a SIP server could be exposed to a malicious Denial of Service (DoS) attack. In this case, malicious parties may purposefully direct a large number of illegitimate messages toward a server. The aim of a DoS attack is often to overload a SIP server: the processing resources of the server are directed to handling the large number of illegitimate messages at the expense of messages from legitimate users. This may result in a loss of service, poor call quality or delays to legitimate users. DoS attacks may be distributed, so-called Distributed Denial of Service (DDoS) attacks. In this case, a malicious party may infect a plurality of computing devices with code that adapts the computing devices to send illegitimate messages towards the SIP server. This results in a high volume of SIP messages that originate from a large number of sources. It is not unusual for hundreds of thousands of computing devices to be infected in this way, resulting in illegitimate messages originating from a diverse set of network addresses. If illegitimate messages are received from a wide spectrum of IP addresses it may be difficult to distinguish illegitimate and legitimate traffic based on the message source. This is compounded by the design of the network stack: IP routers, IP processing software and IP processing hardware deliver packets independently of specific users and without any concept of “legitimacy” or “illegitimacy”. Typically, a packet containing a SIP message will only be deemed “legitimate” or “illegitimate” following SIP authentication, with “illegitimate” messages failing the authentication. However, by this time a SIP server has already committed resources to IP processing and initial SIP parsing and processing. A malicious party can thus disrupt the service provided by a SIP server by hijacking, and overloading, this lower level processing.
There have been efforts in the art to manage an overloaded SIP server.
U.S. Pat. No. 7,522,581 B2 describes a classification algorithm that is applied to incoming SIP messages. Following classification a SIP message may be assigned to one of a plurality of queues for SIP processing. Each queue may represent a priority. Messages in a high priority queue may be processed before messages in a low priority queue.
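The general idea of sorting classified messages into priority queues may be sketched as below. This is not the classification algorithm of the cited patent: the classification rule (by SIP method), the number of queues and all names are assumptions chosen purely for illustration.

```python
from collections import deque
from typing import Optional

# Hypothetical classification rule: a lower number means a higher priority.
METHOD_PRIORITY = {"INVITE": 0, "BYE": 0, "REGISTER": 1}
LOWEST_PRIORITY = 2


class PriorityQueues:
    """One FIFO queue per priority level; higher priority queues are drained first."""

    def __init__(self) -> None:
        self.queues = [deque() for _ in range(LOWEST_PRIORITY + 1)]

    def enqueue(self, method: str, message: str) -> None:
        """Classify a message by its method and append it to the matching queue."""
        priority = METHOD_PRIORITY.get(method, LOWEST_PRIORITY)
        self.queues[priority].append(message)

    def next_message(self) -> Optional[str]:
        """Return the oldest message from the highest priority non-empty queue."""
        for queue in self.queues:
            if queue:
                return queue.popleft()
        return None


pq = PriorityQueues()
pq.enqueue("OPTIONS", "m1")
pq.enqueue("INVITE", "m2")
pq.enqueue("REGISTER", "m3")
order = [pq.next_message() for _ in range(4)]
```

In this sketch every message must still be received and classified before it can be queued, so the classification step itself consumes server resources.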
U.S. Pat. No. 7,869,364 B2 describes a SIP server with a plurality of server states. The SIP server may be placed in a particular state according to the value of a monitored resource metric. Received SIP messages may be sorted into “call processing” or “non-call processing” queues and handled appropriately, depending on the server state.
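A state-based approach of this general kind may be sketched as follows. The state names, the thresholds, the choice of processor load as the monitored metric and the handling policy are all hypothetical and are not taken from the cited patent.

```python
# Hypothetical set of methods treated as "call processing" messages.
CALL_METHODS = {"INVITE", "ACK", "BYE", "CANCEL"}


def server_state(cpu_load: float) -> str:
    """Map a monitored resource metric (here, load between 0.0 and 1.0) to a state."""
    if cpu_load < 0.7:
        return "NORMAL"
    if cpu_load < 0.9:
        return "CONGESTED"
    return "OVERLOADED"


def queue_for(method: str) -> str:
    """Sort a message into the call processing or non-call processing queue."""
    return "call" if method in CALL_METHODS else "non-call"


def should_process(method: str, state: str) -> bool:
    """Illustrative policy: shed non-call traffic first, then all traffic."""
    if state == "NORMAL":
        return True
    if state == "CONGESTED":
        return queue_for(method) == "call"
    return False


state = server_state(0.8)
accept_invite = should_process("INVITE", state)
accept_options = should_process("OPTIONS", state)
```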
Both of these solutions require additional processing and bespoke SIP server systems, which may make them difficult to apply to existing implementations. Under overload conditions, the additional processing may itself increase the amount of lower level processing required, making these solutions less effective when the server is subject to DoS or DDoS attacks. As each solution presents a compromise for allocation of limited server resources, they may also be sub-optimal for dealing with both malicious attacks and large quantities of legitimate traffic.
It is therefore desirable to provide a method and system for managing SIP messages that overcomes the problems associated with server overload and/or malicious attacks and that offers an improvement on known solutions.