With the spread of the Internet, streaming media services are developing rapidly. One important form, the streaming media service based on HTTP (HyperText Transfer Protocol), is an emerging trend.
In an HTTP-based streaming media service, content is encoded into multiple versions at different rates according to different encoding parameters (such as resolution); each version is referred to as an encoding representation. Each encoding representation is divided into several media segments along the time axis. A media segment is the data unit of HTTP transmission and can be uniquely accessed through a URL (Uniform Resource Locator). A client first obtains a media presentation description (MPD) file, a metadata file that tells the client how to access the media segments. The client then continuously obtains and processes media segments according to the information in the MPD file to implement the streaming media service. When the available bandwidth changes, the client chooses media segments of an encoding representation with a correspondingly higher or lower rate to adapt to the new bandwidth.
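The rate adaptation described above can be sketched in a few lines. This is a minimal illustration, not the method of any particular client: the `Representation` class, its fields, the URL template format, and the "highest rate that fits" rule are all assumptions introduced here.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Representation:
    """One encoding representation listed in the MPD (hypothetical model)."""
    rep_id: str
    bitrate_bps: int
    segment_url_template: str  # assumed format, e.g. "low/seg-{number}.m4s"


def choose_representation(representations: Sequence[Representation],
                          available_bps: int) -> Representation:
    """Pick the highest-rate representation that fits the available bandwidth.

    Falls back to the lowest-rate representation when none fits.
    """
    ordered = sorted(representations, key=lambda r: r.bitrate_bps)
    fitting = [r for r in ordered if r.bitrate_bps <= available_bps]
    return fitting[-1] if fitting else ordered[0]


def segment_url(rep: Representation, number: int) -> str:
    """Build the URL that uniquely identifies one media segment."""
    return rep.segment_url_template.format(number=number)
```

A client loop would call `choose_representation` with its latest bandwidth estimate before each request and then fetch `segment_url(rep, n)` over HTTP.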
Index information of a media segment provides metadata of the media segment. Global metadata includes the presentation start time of the media segment, its presentation duration, and information indicating the time position of the media segment in the media representation. Local metadata includes the media segment duration, the accessible subsegments in the media segment, the location of each subsegment, whether a subsegment includes a stream access point (SAP) of a media component, and the time position of the stream access point. The index information of the media segment is important for switching between encoding representations. The client can start decoding and processing an encoding representation only from a stream access point. Therefore, when switching, the client has to find a stream access point in the media segment of the new encoding representation, while downloading and processing of the old encoding representation should continue until the time corresponding to that stream access point.
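As an illustration of how the local metadata supports switching, the sketch below looks up the first usable stream access point in the index of the new encoding representation. The `Subsegment` fields and the function name are hypothetical; they mirror the metadata items listed above rather than any standardized structure.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Subsegment:
    """Local index metadata for one subsegment (hypothetical layout)."""
    start_time: float          # presentation time in seconds
    duration: float            # seconds
    byte_offset: int           # location of the subsegment in the segment
    sap_time: Optional[float]  # time of the SAP, or None if there is none


def find_switch_point(index: Sequence[Subsegment],
                      requested_time: float) -> Optional[Subsegment]:
    """Return the subsegment with the earliest SAP at or after requested_time.

    The old representation keeps playing until the returned sap_time;
    decoding of the new representation starts from there.
    """
    candidates = [s for s in index
                  if s.sap_time is not None and s.sap_time >= requested_time]
    if not candidates:
        return None
    return min(candidates, key=lambda s: s.sap_time)
```

If the function returns `None`, the current segment of the new representation offers no usable stream access point and the client would have to consult the index of a later segment.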
In the 3GPP (3rd Generation Partnership Project) design, the index information of a media segment is stored in a media segment index element and is part of the media segment. The index information is therefore transmitted together with the other content of the media segment. This is not necessary in all cases and may lead to unnecessary data transmission and wasted bandwidth, because the index information is required only during encoding representation switching or time seeking. In other cases, the client only needs to request and download media segments of the same encoding representation in sequence, and the index information of those media segments is not required.
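The overhead argument can be made concrete with a small accounting sketch. The `Action` names and the fixed per-segment index size are assumptions made only for illustration; real index sizes vary per segment.

```python
from enum import Enum
from typing import Iterable


class Action(Enum):
    """What the client is doing when it requests a segment (hypothetical)."""
    SEQUENTIAL = "sequential"  # next segment of the same representation
    SWITCH = "switch"          # encoding representation switching
    SEEK = "seek"              # time seeking


def index_required(action: Action) -> bool:
    """Index information is needed only when switching or seeking."""
    return action in (Action.SWITCH, Action.SEEK)


def wasted_index_bytes(actions: Iterable[Action], index_size_bytes: int) -> int:
    """Index bytes delivered but unused when every segment embeds its index."""
    return sum(index_size_bytes for a in actions if not index_required(a))
```

For a session of three sequential requests, one switch, and one seek, with an assumed index size of 200 bytes per segment, `wasted_index_bytes` counts the three sequential requests and returns 600.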