Real-time applications that transport media frames over the Internet almost always use the Real-time Transport Protocol (RTP) [1]. The media frames are encapsulated into RTP packets and transmitted to the other user(s). RTP also defines a control protocol, the RTP Control Protocol (RTCP). RTCP provides functionality that is related to the media flow, for example:
- Quality feedback, for example packet loss rate and inter-arrival jitter. The feedback may be used to adapt the transmission of the media.
- Information needed for synchronization between different media, for example to achieve and maintain lip-sync between speech and video.
- Source Description (SDES), which identifies the media sender.
- Application-specific signaling (APP), which can be any kind of signaling and does not need to be standardized. RTCP is constructed in such a way that receivers that do not understand the RTCP-APP packet can discard it and continue parsing the remaining part(s) of the RTCP packet.
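The discard-and-continue behavior described for APP packets follows from the common RTCP header, which carries a packet type and a length field. The following is a minimal sketch, not a complete RTCP parser: the function name walk_rtcp and the choice of which types are "understood" are illustrative assumptions, while the header layout and packet type values are those of RFC 3550.

```python
import struct

# RTCP packet type values from RFC 3550.
PT_SR, PT_RR, PT_SDES, PT_BYE, PT_APP = 200, 201, 202, 203, 204


def walk_rtcp(datagram, understood=(PT_SR, PT_RR, PT_SDES, PT_BYE)):
    """Return (packet_type, body) for each packet this receiver understands.

    Hypothetical helper: by default APP is treated as not understood.
    """
    found = []
    offset = 0
    while offset + 4 <= len(datagram):
        # Common header: V(2) P(1) count(5) | packet type (8) | length (16).
        first, ptype, length_words = struct.unpack_from("!BBH", datagram, offset)
        if first >> 6 != 2:
            raise ValueError("not an RTCP version 2 packet")
        # The length field counts 32-bit words minus one, header included.
        size = (length_words + 1) * 4
        if offset + size > len(datagram):
            raise ValueError("truncated RTCP packet")
        if ptype in understood:
            found.append((ptype, datagram[offset + 4:offset + size]))
        # Packets of other types are skipped simply by advancing past them.
        offset += size
    return found
```

Because every packet announces its own length, a receiver can step over an APP packet without knowing anything about its contents.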
Sending RTCP packets requires some bandwidth. This bandwidth is regarded as ‘overhead’ since these packets are sent in addition to the media packets. To keep the overhead small in relation to the media bandwidth, a limit for the RTCP bandwidth is often defined at session setup. A common choice is to allow 2.5% to 5% of the bandwidth to be used for RTCP.
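As a rough illustration of what such a limit means in absolute terms, the budget can be computed as a fraction of the media bandwidth. This is a sketch only: the function name is hypothetical, and the example stream (72-byte packets sent 50 times per second, that is, one 20 ms speech frame per packet) is an assumption chosen for illustration.

```python
def rtcp_bandwidth_bps(media_bandwidth_bps, fraction=0.05):
    """Return the RTCP budget in bits per second (hypothetical helper)."""
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    return media_bandwidth_bps * fraction


# Assumed example stream: 72 bytes per packet, 50 packets per second.
media_bps = 72 * 8 * 50  # 28800 bit/s
low = rtcp_bandwidth_bps(media_bps, 0.025)   # about 720 bit/s
high = rtcp_bandwidth_bps(media_bps, 0.05)   # about 1440 bit/s
```

For a low-rate voice stream the budget is thus on the order of one kilobit per second, which is why the size of each RTCP packet matters.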
Traditional RTCP packets, so-called compound RTCP packets, are fairly large because each must contain either a Sender Report or a Receiver Report as well as SDES, even if the intention is to send only the APP packet. This means that such packets can only be transmitted quite infrequently, often less than once per second. A solution is, however, being discussed in the IETF (Internet Engineering Task Force), see [2], which allows sending only APP packets in so-called non-compound RTCP packets. With non-compound RTCP packets, the packet size is significantly reduced.
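The size difference can be illustrated by assembling both variants byte by byte. The sketch below is built on assumptions: the SSRC, the CNAME, the APP name "TEST" and the 4-byte APP payload are placeholders, and the Receiver Report carries no report blocks, so a real compound packet would be larger. The packet layouts follow RFC 3550; the sizes exclude IP and UDP headers.

```python
import struct


def rtcp_header(count, ptype, body_len):
    """Common RTCP header; length is in 32-bit words minus one."""
    if body_len % 4 != 0:
        raise ValueError("RTCP body must be a multiple of 4 bytes")
    return struct.pack("!BBH", 0x80 | (count & 0x1F), ptype, body_len // 4)


def receiver_report(ssrc):
    # Receiver Report with zero report blocks (smallest valid form).
    return rtcp_header(0, 201, 4) + struct.pack("!I", ssrc)


def sdes_cname(ssrc, cname):
    # One chunk: SSRC, CNAME item (type 1), null terminator, padding.
    chunk = struct.pack("!I", ssrc) + bytes([1, len(cname)]) + cname + b"\x00"
    chunk += b"\x00" * (-len(chunk) % 4)
    return rtcp_header(1, 202, len(chunk)) + chunk


def app_packet(ssrc, name, data, subtype=0):
    # APP: SSRC, 4-character name, application-dependent data.
    body = struct.pack("!I4s", ssrc, name) + data
    return rtcp_header(subtype, 204, len(body)) + body


ssrc = 0x12345678                           # placeholder SSRC
app = app_packet(ssrc, b"TEST", b"\x00" * 4)  # placeholder name and payload
compound = receiver_report(ssrc) + sdes_cname(ssrc, b"user@example.com") + app
non_compound = app
print(len(compound), len(non_compound))
```

Even with an empty Receiver Report, the mandatory parts account for most of the compound packet; the APP packet alone is a fraction of that size.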
For Multimedia Telephony, it has been decided that RTCP-APP packets shall be used for sending adaptation signaling for voice. The adaptation signaling may suggest: changing the codec mode; packing more or fewer frames into one packet; or adding or removing application layer redundancy, possibly with an offset.
RTCP packets are often quite large, significantly larger than normal voice packets. For example, one VoIP (Voice over IP) packet with AMR122 (AMR=Adaptive Multi-Rate) encoded media is about 72 bytes without header compression and typically about 35 bytes when header compression is applied. Compound RTCP packets are typically on the order of 100-140 bytes without header compression and 80-120 bytes with header compression. Non-compound RTCP packets are significantly smaller. A non-compound RTCP packet with only APP is on the order of 50 bytes without header compression and about 30 bytes with header compression.
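To see how these sizes translate into how often a packet can be sent, the average interval between RTCP packets can be estimated from the packet size and the RTCP bandwidth budget. This is a simplified sketch: the function name is hypothetical, the 720 bit/s budget is an assumption (2.5% of a stream of 72-byte packets at 50 packets per second), and the timing rules of RFC 3550 (minimum interval, randomization, sender and receiver shares) are ignored.

```python
def min_average_interval_s(packet_bytes, rtcp_budget_bps):
    """Average spacing in seconds that keeps RTCP within its budget."""
    if rtcp_budget_bps <= 0:
        raise ValueError("budget must be positive")
    return packet_bytes * 8 / rtcp_budget_bps


budget_bps = 720.0  # assumed: 2.5% of 72 bytes * 8 bits * 50 packets/s

# Sizes without header compression, taken from the figures in the text.
compound_interval = min_average_interval_s(120, budget_bps)
non_compound_interval = min_average_interval_s(50, budget_bps)
print(round(compound_interval, 2), round(non_compound_interval, 2))
```

Under these assumptions a 120-byte compound packet can be sent roughly every 1.3 seconds on average, whereas a 50-byte non-compound packet can be sent roughly every 0.56 seconds.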