The Session Initiation Protocol (SIP), described in RFC 3261, is a signalling protocol for setting up, managing and tearing down voice, video and other multi-media sessions in packet-based networks. SIP is designed to handle only these aspects of communication; other protocols, such as the Real-time Transport Protocol (RTP), are used for the actual data transport. SIP is an application-layer protocol that can run over transport protocols such as the User Datagram Protocol (UDP) and the Transmission Control Protocol (TCP).
A SIP network is typically composed of four types of logical SIP entities, namely, User Agents (UA), Proxy Servers, Redirect Servers and Registrars.
User Agents (UA) are endpoint entities that initiate and terminate SIP sessions by exchanging requests and responses. A UA contains a User Agent Client (UAC) and a User Agent Server (UAS). A UAC is a client application that initiates SIP requests. A UAS is a server application that contacts a user when a SIP request is received and that returns a response on behalf of the user. Typical devices that have a UA function in a SIP network include PCs, IP telephones and automated answering services.
A proxy server is an intermediary entity that acts as both a server and a client in order to make requests on behalf of other clients. Requests are serviced either internally or by passing them on to other servers. A proxy server may receive requests and forward them to another server (called a next-hop server) that has more precise location information about the callee. The next-hop server might be another proxy server, a UAS, or a redirect server.
A redirect server accepts a SIP request, maps the SIP address of the called party to a new address and returns that address to its client, typically a proxy server. Registrars are kept continually updated on the current locations of users.
The primary function of proxy and redirect servers is call routing, that is, determining the set of servers to traverse in order to complete the call. A proxy or redirect server can use any means at its disposal to determine the ‘next-hop’ server, including executing programs and consulting databases.
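The division of labour between registrar, redirect server and proxy server can be sketched in a few lines. This is a minimal illustration only: the in-memory `location_table`, the function names and the example addresses are all assumptions made for this sketch and do not come from RFC 3261 or any SIP library.

```python
# Hypothetical in-memory location table, standing in for whatever
# database a real registrar would keep up to date.
location_table = {}


def register(address, contact):
    """Registrar role: record the current location (contact) of a user."""
    location_table[address] = contact


def redirect(called_address):
    """Redirect-server role: map the called party's address to a new
    address and hand it back to the client (None if unknown)."""
    return location_table.get(called_address)


def next_hop(called_address, default_server):
    """Proxy role: choose the server to forward the request to.
    Falls back to a default next-hop server when no location is known."""
    return location_table.get(called_address, default_server)


register("sip:bob@example.com", "sip:bob@192.0.2.4")
print(redirect("sip:bob@example.com"))
print(next_hop("sip:carol@example.com", "sip:proxy.example.net"))
```

A real server is free to replace the dictionary lookup with any other means of locating the callee, such as running a program or querying a database.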
SIP is a text-based protocol partly modelled on HTTP. There are two types of SIP message: requests, which are sent from clients to servers, and responses, which are sent from servers to clients. A request and the responses that follow it are known as a SIP transaction.
Request methods defined in the protocol include ‘INVITE’, which is used to initiate a session or change session parameters; ‘ACK’, which is used to confirm that a session has been initiated; and ‘BYE’, which is used to terminate a session.
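Because SIP messages are plain text, the first line of a message is enough to tell a request from a response. The following is a minimal sketch, not a complete parser: the function name `parse_start_line` and the dictionary layout are assumptions made for illustration, and no validation beyond the basic three-field shape is attempted.

```python
def parse_start_line(line):
    """Split the first line of a SIP message into its three fields.

    A request starts with the method (e.g. INVITE), whereas a response
    starts with the protocol version (e.g. SIP/2.0).
    """
    parts = line.strip().split(" ", 2)
    if len(parts) != 3:
        raise ValueError(f"malformed start line: {line!r}")
    if parts[0].startswith("SIP/"):
        version, code, reason = parts
        return {"type": "response", "version": version,
                "code": int(code), "reason": reason}
    method, uri, version = parts
    return {"type": "request", "method": method,
            "uri": uri, "version": version}


print(parse_start_line("INVITE sip:bob@example.com SIP/2.0"))
print(parse_start_line("SIP/2.0 180 Ringing"))
```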
Response messages contain numeric response codes. There are two types of response, divided into six classes. ‘Provisional (1xx class)’ responses are used by a server to indicate the progress of a SIP transaction. An example of a provisional response is the 180 ‘Ringing’ response. ‘Final (2xx, 3xx, 4xx, 5xx, 6xx classes)’ responses are used to terminate SIP transactions. An example of a final response is the 200 ‘OK’ response.
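The relationship between response codes, the two response types and the six classes can be expressed as a short function. This is an illustrative sketch; the name `classify_response` and the tuple it returns are assumptions of this example.

```python
def classify_response(code):
    """Return (type, class) for a SIP response code.

    1xx codes are provisional; 2xx to 6xx codes are final.
    """
    if not 100 <= code <= 699:
        raise ValueError(f"not a SIP response code: {code}")
    response_class = f"{code // 100}xx"
    response_type = "provisional" if code < 200 else "final"
    return response_type, response_class


print(classify_response(180))  # 180 Ringing
print(classify_response(200))  # 200 OK
```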
A caller establishes a call by issuing an ‘INVITE’ request. This request contains header fields used to convey information about the call. The most important header fields are ‘To’ and ‘From’, which contain the callee's and the caller's SIP addresses, respectively. The ‘Subject’ header field identifies the subject of the call.
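To make the header fields concrete, the sketch below assembles the text of a simplified ‘INVITE’ request carrying only the three header fields named above. It is deliberately incomplete: a request conforming to RFC 3261 also needs further mandatory header fields (such as Via, Call-ID, CSeq and Max-Forwards), which are omitted here. The function name and the example addresses are assumptions for illustration.

```python
def build_invite(caller, callee, subject):
    """Build a simplified, illustrative INVITE request as text.

    Only the To, From and Subject header fields are included; other
    header fields required by RFC 3261 are left out of this sketch.
    """
    lines = [
        f"INVITE {callee} SIP/2.0",  # request line: method, target, version
        f"To: <{callee}>",           # callee's SIP address
        f"From: <{caller}>",         # caller's SIP address
        f"Subject: {subject}",       # subject of the call
        "",                          # blank line ends the header section
        "",
    ]
    return "\r\n".join(lines)


print(build_invite("sip:alice@example.com", "sip:bob@example.com", "Project update"))
```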
If the callee accepts the call, it responds with an ‘OK’ response. The connection is established using a three-way handshake, so the caller responds with an ‘ACK’ message to confirm receipt of the ‘OK’ response.
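The three-way handshake can be traced from the caller's side with a small sketch. It assumes the caller has already sent ‘INVITE’ and is given the list of response codes that arrive afterwards; the function name `caller_handshake` is hypothetical. Only the accepted-call path described above is modelled, so the caller's handling of other final responses is left out.

```python
def caller_handshake(response_codes):
    """Trace the messages a caller sends during call set-up.

    Returns (messages_sent, established). Only the success path is
    modelled: INVITE -> 200 OK -> ACK.
    """
    sent = ["INVITE"]
    for code in response_codes:
        if 100 <= code < 200:
            continue  # provisional response (e.g. 180 Ringing): keep waiting
        if 200 <= code < 300:
            sent.append("ACK")  # confirm receipt of the OK response
            return sent, True
        break  # any other final response: call not established (not modelled further)
    return sent, False


print(caller_handshake([180, 200]))
```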
SIP provides for a variety of multi-media communication features similar to those provided by traditional Private Branch Exchanges, for example, call waiting, call hold, music on hold and conference calling. It is envisaged that many more such features for endpoint-to-endpoint communication in SIP networks will be developed. Communication will sometimes occur in circumstances where one of the endpoints provides a new feature that the other endpoint does not. To date, if the set of features available at one endpoint in a SIP network differs from the set available at the other endpoint, the endpoints communicate using their lowest common feature set.
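The lowest common feature set amounts to a set intersection, as the following sketch shows. The feature names and the function `common_features` are assumptions made for illustration; how endpoints actually advertise their features is not covered here.

```python
def common_features(endpoint_a, endpoint_b):
    """Return the features that both endpoints support."""
    return set(endpoint_a) & set(endpoint_b)


# Hypothetical feature sets for two endpoints.
caller_features = {"call waiting", "call hold", "music on hold", "conference calling"}
callee_features = {"call waiting", "call hold"}

# Features only the caller supports (here music on hold and conference
# calling) are unavailable for this call.
print(sorted(common_features(caller_features, callee_features)))
```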