Real-time Transport Protocol (RTP) is a format for delivering audio and video media data over a packet-switched network. RTP is used for transporting real-time media data, such as interactive audio and video. It is therefore used in applications such as IPTV, conferencing, and Voice over IP (VoIP).
Secure Real-time Transport Protocol (SRTP), specified in IETF RFC 3711, is a transport security protocol that provides a form of encrypted RTP. In addition to encryption, it provides message authentication and integrity, and replay protection, in unicast, multicast and broadcast applications. SRTP is used to protect content delivered between peers in an RTP session. As a transport security protocol, it is intended only to protect data during transport between two peers running SRTP; in particular, it does not protect data once it has been delivered to the endpoint of the SRTP session. In addition, the sending peer provides the protection by encrypting the media data. In other words, it is assumed that the sending peer has knowledge of all keying material and is the one applying the protection to the data.
RTP is closely related to RTCP (RTP control protocol), which can be used to control the RTP session, and similarly SRTP has a sister protocol, called Secure RTCP (or SRTCP). SRTCP provides the same security-related features to RTCP as the ones provided by SRTP to RTP.
The use of SRTP or SRTCP is optional when RTP or RTCP is used. Even when SRTP/SRTCP are used, all provided features (such as encryption and authentication) are optional and can be enabled or disabled separately. The only exception is message authentication, which is mandatory when using SRTCP.
Key management for SRTP and SRTCP may be performed independently of each other, and so different encryption material may be used for each protocol. The confidentiality protection in both SRTP and SRTCP applies to the signalling payload, whereas the integrity protection covers both the payload and the metadata contained in each packet header.
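The split between what is encrypted and what is integrity-protected can be illustrated with a minimal Python sketch. It assumes the default authentication transform of RFC 3711 (HMAC-SHA1 truncated to an 80-bit tag, computed over the packet header, the payload and the rollover counter, ROC). The helper names build_rtp_header, auth_tag and verify are hypothetical and exist only for this illustration, the payload is a placeholder rather than real ciphertext, and key derivation is omitted entirely.

```python
import hashlib
import hmac
import struct

TAG_LEN = 10  # assumption: 80-bit tag, the default length in RFC 3711


def build_rtp_header(seq, timestamp, ssrc, payload_type=0):
    """Build a fixed 12-byte RTP header (V=2, no padding/extension/CSRC, marker=0)."""
    return struct.pack("!BBHII", 0x80, payload_type & 0x7F, seq, timestamp, ssrc)


def auth_tag(auth_key, header, payload, roc):
    """HMAC-SHA1 over header || payload || ROC, truncated to TAG_LEN bytes."""
    message = header + payload + struct.pack("!I", roc)
    return hmac.new(auth_key, message, hashlib.sha1).digest()[:TAG_LEN]


def verify(auth_key, header, payload, roc, tag):
    """Return True only if the tag matches the header, payload and ROC."""
    return hmac.compare_digest(auth_tag(auth_key, header, payload, roc), tag)


if __name__ == "__main__":
    key = bytes(20)                      # placeholder session authentication key
    payload = b"placeholder-encrypted-payload"
    header = build_rtp_header(seq=100, timestamp=160, ssrc=0x1234ABCD)
    tag = auth_tag(key, header, payload, roc=0)

    print(verify(key, header, payload, 0, tag))        # unchanged packet

    # Changing only the sequence number in the header invalidates the tag.
    renumbered = build_rtp_header(seq=101, timestamp=160, ssrc=0x1234ABCD)
    print(verify(key, renumbered, payload, 0, tag))
```

Because the tag is computed over the header as well as the payload, a receiver holding the same key rejects a packet whose header fields were altered after the sender applied the protection, even when the payload is untouched.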
Many content delivery systems are based on store and forward mechanisms, and require end-to-end confidentiality protection of media even where an intermediate node handles the data. Two typical examples are:
1. A networked media server for prime content, which requires end-to-end media protection. The media server allows the user to perform fast forward and rewind operations on the media stream.
2. A network telephone answering machine that supports end-to-end protection.
For the above scenarios, SRTP cannot be used for media protection because of the design of its protection mechanisms. The confidentiality and integrity protection mechanisms depend on certain parameters in the RTP packet header. One example is the formation of the Initialization Vector (IV), a block of bits used in conjunction with keying material to prevent a data unit that is identical to a previous data unit from producing the same ciphertext when encrypted. This means that at least these header values have to remain the same if a client is to be able to decrypt a forwarded SRTP stream and check its integrity protection. Furthermore, if integrity protection is applied, not even a single bit in the headers or payloads can be changed when they are resent from an intermediate node. The intermediate node therefore has to resend the SRTP packets exactly as received, which makes fast forward and rewind operations impossible, because each RTP packet in the stream must carry a monotonically increasing sequence number, SEQ.
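The dependence of the IV on header fields can be sketched as follows. The example assumes the AES counter mode transform that RFC 3711 defines as the default cipher, where the IV is formed from a 112-bit session salt, the SSRC and the 48-bit packet index (rollover counter, ROC, combined with SEQ). The function names packet_index and aes_cm_iv are hypothetical, and the sketch computes only the IV, not the keystream or the ciphertext.

```python
def packet_index(roc, seq):
    """48-bit packet index: i = 2^16 * ROC + SEQ."""
    if not 0 <= seq < 1 << 16:
        raise ValueError("SEQ must fit in 16 bits")
    if not 0 <= roc < 1 << 32:
        raise ValueError("ROC must fit in 32 bits")
    return (roc << 16) | seq


def aes_cm_iv(salt, ssrc, roc, seq):
    """IV = (salt * 2^16) XOR (SSRC * 2^64) XOR (i * 2^16), as a 16-byte value."""
    if len(salt) != 14:
        raise ValueError("session salt must be 112 bits (14 bytes)")
    if not 0 <= ssrc < 1 << 32:
        raise ValueError("SSRC must fit in 32 bits")
    k_s = int.from_bytes(salt, "big")
    i = packet_index(roc, seq)
    iv = (k_s << 16) ^ (ssrc << 64) ^ (i << 16)
    return iv.to_bytes(16, "big")


if __name__ == "__main__":
    salt = bytes(14)  # placeholder salt; a real session derives this from the master salt
    original = aes_cm_iv(salt, ssrc=0x1234ABCD, roc=0, seq=100)
    renumbered = aes_cm_iv(salt, ssrc=0x1234ABCD, roc=0, seq=101)
    print(original.hex())
    print(renumbered.hex())
    print(original == renumbered)  # the two IVs differ
```

Since SEQ and SSRC feed directly into the IV, an intermediate node that rewrites either field causes the receiver to derive a different keystream from the one the sender used, so the payload no longer decrypts correctly.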