Multimedia streaming is a major application on the Internet and in 3G networks. A packet-switched streaming service has been standardized in 3GPP that allows audio and video data to be streamed to handheld devices. In another context, streaming is used to realize mobile TV services.
Within a packet-switched streaming service, a streaming server transmits packets over a network to a streaming client. Another example of transmitting packets over a network such as the Internet is progressive download. Often these packets also pass through intermediate nodes such as proxies. Adaptive streaming enables a streaming service to adapt to varying network conditions. This is necessary because, for continuous playout of a media stream, the transport network has to provide a throughput that is at least as high as the rate of the encoded content. Although best-effort networks can often provide the required bit rate, they cannot guarantee that it is available during the whole lifetime of the session. In particular, mobile links are often characterized by a varying throughput due to the nature of the wireless channel.
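The throughput condition can be illustrated with a minimal sketch of a client playout buffer. The one-second time step, the assumption that playout starts immediately, and the names `first_underrun`, `throughput_kbps`, `content_rate_kbps` and `initial_buffer_kbit` are chosen only for this illustration and do not describe any particular client.

```python
def first_underrun(throughput_kbps, content_rate_kbps, initial_buffer_kbit=0.0):
    """Return the index of the first second in which playout stalls, or None.

    throughput_kbps: received bit rate per one-second interval (illustrative trace).
    content_rate_kbps: constant rate of the encoded content.
    """
    buffered = initial_buffer_kbit
    for second, rate in enumerate(throughput_kbps):
        buffered += rate                      # data received during this second
        if buffered < content_rate_kbps:
            return second                     # not enough data to play this second
        buffered -= content_rate_kbps         # data consumed by playout
    return None


# A link that stays above the content rate never stalls.
print(first_underrun([300, 300, 300], 256))
# A link that falls below the content rate stalls once the buffer is used up.
print(first_underrun([300, 200, 200, 200], 256))
```

A larger initial buffer only delays the stall. If the throughput stays below the content rate, the buffer eventually runs dry unless the stream itself is adapted.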
In case the transport network can only provide a bit rate that is lower than the content rate, i.e. in the case of insufficient bandwidth, not all packets of the packet-switched video data can be transmitted. In this case, some packets have to be dropped. This dropping of packets, also called thinning, can be done either at the server or at the proxy.
Often, the selection of the dropped packets is done randomly. In this case, large degradations of the stream media quality can be expected, since it is likely that packets with large impact on the overall media quality will be dropped.
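Random thinning can be sketched as follows. The `Packet` record, the byte budget per interval and the function name `thin_randomly` are hypothetical and serve only to illustrate the idea. In particular, the frame type is carried here only so that the effect can be observed; the random selection itself does not use it.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    seq: int          # transmission order
    frame_type: str   # "I", "P" or "B" (not used by the random selection)
    size: int         # payload size in bytes


def thin_randomly(packets, budget_bytes, rng):
    """Drop randomly chosen packets until the remaining ones fit the byte budget."""
    kept = list(packets)
    total = sum(p.size for p in kept)
    while kept and total > budget_bytes:
        victim = kept.pop(rng.randrange(len(kept)))  # any packet may be hit
        total -= victim.size
    return kept


gop = [Packet(0, "I", 1000), Packet(1, "B", 200), Packet(2, "B", 200), Packet(3, "P", 400)]
kept = thin_randomly(gop, budget_bytes=1500, rng=random.Random())
print([p.frame_type for p in kept])
```

Because every packet is equally likely to be chosen, the I frame packet in this example can be dropped just as easily as a B frame packet, which is the source of the large quality degradation mentioned above.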
Alternatively, the packets to be dropped can be selected according to their importance for the overall media quality.
Because frames are normally compressed using video compression, the frames of a video stream are known to have compressed data of varying size. A frame is essentially a picture captured at a predetermined instant in time, and the set of frames builds the video stream. In typical video coding schemes, such as an MPEG coded video stream, a GOP (Group of Pictures) is a group of successive pictures within the video stream. Each MPEG coded video stream consists of successive GOPs. A GOP can contain the following frame types:
I frame (intra-coded frame): a frame corresponding to a fixed image which is independent of other frames. Each GOP begins with this type of frame.
P frame (predictive coded frame): contains motion compensated difference information relative to previously coded frames, using up to 1 reference frame for prediction. Normally, P frames need much less storage space than I frames.
B frame (bidirectional predictive coded frame): contains motion compensated difference information relative to previously coded frames, using up to 2 reference frames for prediction. Normally, B frames need less storage space than I frames or P frames.
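The GOP structure described above can be illustrated with a short sketch that splits a sequence of frame types into GOPs. Representing the stream as a string of the letters I, P and B, and the function name `split_into_gops`, are assumptions made for this example only. The sketch follows the description that each GOP begins with an I frame.

```python
def split_into_gops(frame_types):
    """Split a frame-type sequence such as "IBBPBBPIBBP" into GOPs.

    A new GOP is started at every I frame. Frames that precede the first
    I frame are collected into a leading, incomplete group.
    """
    gops = []
    for frame_type in frame_types:
        if frame_type == "I" or not gops:
            gops.append([])
        gops[-1].append(frame_type)
    return ["".join(gop) for gop in gops]


print(split_into_gops("IBBPBBPIBBP"))  # ['IBBPBBP', 'IBBP']
```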
The importance of a packet for the overall media quality depends on the frame to which it belongs. In the case of video encoded using sequential prediction structures (e.g. a structure such as IPPP) or structures with non-referenced B frames (IBP or IBBP), a packet belonging to an I frame is more important than a packet belonging to a P frame, which in turn is more important than a packet belonging to a B frame. Furthermore, the importance of a P frame depends on its position in the GOP. By way of example, a P frame just before an I frame is less important than one appearing earlier in the GOP, because fewer frames depend on it. For hierarchical prediction structures (hierarchical B frames), the importance of a video frame increases with the number of pictures that depend on that frame. Exploiting such knowledge for video thinning at a server or proxy is only possible if the frame type of the packet's corresponding frame, or its position within the GOP, is known. This knowledge either has to be stored at the dropping node or has to be extracted from the stream by checking the payload.
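The ordering described above can be sketched for the case of sequential prediction or non-referenced B frames, assuming that the frame type and GOP position of every packet are known to the dropping node. The `Packet` record, the one-packet-per-frame simplification, the byte budget and the names `drop_order_key` and `thin_by_importance` are hypothetical. The tie-break among B frames is an arbitrary choice of this sketch and is not taken from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    seq: int            # transmission order
    frame_type: str     # "I", "P" or "B"
    gop_position: int   # index of the frame within its GOP, 0 for the I frame
    size: int           # payload size in bytes


TYPE_RANK = {"B": 0, "P": 1, "I": 2}


def drop_order_key(packet):
    # Lower keys are dropped first: B before P before I.
    # Within one type, frames later in the GOP are dropped first
    # (for B frames this tie-break is an assumption of the sketch).
    return (TYPE_RANK[packet.frame_type], -packet.gop_position)


def thin_by_importance(packets, budget_bytes):
    """Drop the least important packets until the rest fit the byte budget."""
    ordered = sorted(packets, key=drop_order_key)   # least important first
    total = sum(p.size for p in ordered)
    dropped = 0
    while dropped < len(ordered) and total > budget_bytes:
        total -= ordered[dropped].size
        dropped += 1
    # Restore transmission order for the packets that survive.
    return sorted(ordered[dropped:], key=lambda p: p.seq)


gop = [Packet(0, "I", 0, 1000), Packet(1, "P", 1, 400),
       Packet(2, "P", 2, 400), Packet(3, "P", 3, 400)]
kept = thin_by_importance(gop, budget_bytes=1500)
print([(p.frame_type, p.gop_position) for p in kept])  # [('I', 0), ('P', 1)]
```

For the IPPP example, the two P frames closest to the next I frame are dropped first, so the I frame and the earliest P frame survive.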
However, this knowledge is often not available, especially when the proxy carries out the frame dropping, because the proxy has no a priori knowledge of the stream. In the case of an encrypted video, a check of the payload is not possible. In the case of a non-encrypted video, a payload check is often not feasible because it would increase the complexity of the proxy and the processing time too much.