In recent years, Internet video services have developed rapidly, and video content now accounts for half of all Internet traffic. This growth has been driven by the continuous development of streaming media technology. Current streaming media technologies fall mainly into two types: one is a connection-oriented streaming media technology represented by RTSP/RTP (Real Time Streaming Protocol/Real-time Transport Protocol); the other is a connectionless streaming media technology based on HTTP (Hypertext Transfer Protocol) progressive download, which is currently used by mainstream video websites.
The RTSP/RTP streaming media technology is a peer-to-peer download technology based on a multicast application layer protocol, where RTP is used for transmitting streaming media data, and RTSP is used for collecting statistics on, managing, and controlling RTP transmission. The two work together and can significantly improve the transmission efficiency of real-time network data. However, the RTSP/RTP streaming media technology has certain defects:
The logical implementation of the RTSP/RTP protocol stack is relatively complicated. Compared with HTTP, it is more difficult to implement the hardware and software of a terminal that supports RTSP/RTP, and this is especially evident in embedded terminals. In addition, the network port (554) used by RTSP may be blocked by a firewall, NAT, or the like in some users' networks, in which case RTSP cannot be used. Although some streaming servers allow RTSP to be tunneled over HTTP port 80, actual deployment is inconvenient.
The streaming media technology of HTTP progressive download works as follows: an HTTP terminal may start to play streaming media data before the entire streaming media file has been downloaded, and if both the HTTP terminal and the streaming media server support HTTP 1.1, the HTTP terminal may further select any time point in the part that has not yet been downloaded and start media playback from there. Currently, mainstream video websites deliver streaming media in the HTTP progressive download manner.
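As an illustration of the seek behavior described above, HTTP 1.1 byte-range requests allow a terminal to fetch data starting at an arbitrary offset. The following is a minimal sketch that only builds the request header; the function name, the constant-bit-rate assumption used to map a time point to a byte offset, and the example figures are hypothetical and are not part of the described solution.

```python
def range_header_for_time(seek_seconds: float, bitrate_bps: int) -> dict:
    """Build an HTTP/1.1 Range header for a seek position.

    Assumes a constant-bit-rate file so that time maps linearly to bytes;
    real container formats usually need an index lookup instead.
    """
    if seek_seconds < 0 or bitrate_bps <= 0:
        raise ValueError("seek_seconds must be >= 0 and bitrate_bps > 0")
    byte_offset = int(seek_seconds * bitrate_bps / 8)
    # An open-ended range ("bytes=N-") asks for everything from byte N onward.
    return {"Range": f"bytes={byte_offset}-"}


# Example: seek to 60 seconds in a 512 kbps file (assumed figures).
headers = range_header_for_time(60, 512_000)
print(headers)
```

A server that supports range requests would normally answer such a request with a partial-content response carrying only the requested bytes.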
Compared with the RTSP/RTP technology, the streaming media technology of HTTP progressive download uses the stateless HTTP protocol. When an HTTP terminal requests streaming media data from a streaming media server, the streaming media server delivers the requested streaming media data to the terminal; however, the streaming media server does not record a state of the terminal, and each request of the HTTP terminal is an independent one-time session.
As the simplest and earliest streaming media solution, HTTP progressive download has a notable advantage: only a standard Web server needs to be maintained, and installing and maintaining a Web server involves far less workload and complexity than a dedicated streaming server. However, its disadvantages are also obvious. Firstly, bandwidth is easily wasted. When an HTTP terminal plays content while downloading streaming media data from a streaming media server, if the user of the terminal stops watching before playback is completed, the streaming media data that has been downloaded but not played wastes bandwidth resources. Secondly, HTTP-based progressive download applies only to on-demand content and does not support live content.
In view of this, an HTTP Adaptive Streaming (hereinafter referred to as “HAS”) technology has emerged that combines the RTSP/RTP streaming media technology and the streaming media technology of HTTP progressive download. The HAS technology can greatly improve users' media playback experience while reducing the technical complexity of a streaming media server; in addition, HTTP-based transmission improves the ability of streaming media data to pass through network devices. Currently, the HAS technology has become a development trend of the streaming media video industry.
A key aspect of the HAS technology is to partition a streaming media file into segments, where each segment has the same duration, which is approximately 10 seconds. At the video coding layer, this means that each segment includes several complete video GOPs (groups of pictures), and each segment has a key I frame, so as to ensure the independence of each segment.
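The partitioning described above can be sketched as grouping whole GOPs until a target duration is reached. This is only an illustrative sketch: the function name, the list-of-durations input, and the 2-second GOP length in the example are assumptions, and it is assumed that each GOP begins with an I frame.

```python
def group_gops_into_segments(gop_durations, target_seconds=10.0):
    """Group consecutive GOP durations into segments of roughly target_seconds.

    A segment is closed once its accumulated duration reaches the target, so
    every segment consists of whole GOPs only.
    """
    segments, current, elapsed = [], [], 0.0
    for duration in gop_durations:
        current.append(duration)
        elapsed += duration
        if elapsed >= target_seconds:
            segments.append(current)
            current, elapsed = [], 0.0
    if current:  # trailing GOPs form a final, shorter segment
        segments.append(current)
    return segments


# Example: twelve GOPs of 2 seconds each (assumed figures).
segments = group_gops_into_segments([2.0] * 12)
print([sum(s) for s in segments])  # prints [10.0, 10.0, 4.0]
```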
Segments may be separately encoded at different bit rates, so that segments with multiple bit rates are obtained. That is, a streaming media server locally stores streaming media files encoded at different bit rates; for example, the same content may be encoded to obtain a streaming media file with a bit rate of 128 kbps, a streaming media file with a bit rate of 256 kbps, a streaming media file with a bit rate of 512 kbps, and the like. The streaming media server further provides an index file, in which related information about the streaming media files with different bit rates is recorded. After downloading the index file from the streaming media server, a terminal requests, according to the information recorded in the index file, to download and play the streaming media file with the lowest bit rate. If a segment of the streaming media file with the lowest bit rate can be successfully downloaded and played, this indicates that the current terminal capability and the current network status can support the lowest bit rate, and the terminal attempts to request a streaming media file with a higher bit rate. If a segment of the streaming media file with the higher bit rate can also be successfully downloaded and played, the terminal continues to attempt to download a streaming media file with a still higher bit rate. If a segment of the streaming media file with the still higher bit rate cannot be successfully downloaded and played, the terminal continues to download and play the streaming media file with the lower bit rate that was last played successfully. This process continues until the terminal stabilizes at a suitable bit rate for downloading and playing a streaming media file.
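One possible reading of the bit rate probing procedure above is sketched below. The function name and the can_play callback are hypothetical; the callback merely stands in for "download and play one segment at this bit rate" and is not an interface defined by the described solution.

```python
def choose_stable_bitrate(bitrates_kbps, can_play):
    """Probe bit rates in ascending order and return the highest one that works.

    can_play is a hypothetical callback that returns True when a segment at
    the given bit rate was downloaded and played successfully.
    """
    ladder = sorted(bitrates_kbps)
    if not ladder or not can_play(ladder[0]):
        return None  # not even the lowest bit rate is sustainable
    stable = ladder[0]
    for candidate in ladder[1:]:
        if not can_play(candidate):
            break  # stay at the last bit rate that succeeded
        stable = candidate
    return stable


# Example: a link that sustains at most 300 kbps (assumed figure).
print(choose_stable_bitrate([128, 256, 512], lambda kbps: kbps <= 300))  # prints 256
```

Each successful probe in this sketch corresponds to one segment played at a bit rate below the one finally chosen.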
An HAS technology based on HTTP Live Streaming (HLS) is used as an example. A complete streaming media file is partitioned into multiple HTTP-based segments. When starting a streaming media session, a terminal first downloads, from a streaming media server, an extended M3U playlist file (that is, an index file, hereinafter referred to as a playlist file for short) that includes metadata. In the HLS specifications, the playlist file is described as follows:
A playlist file is a text file that consists of multiple individual lines, where the lines are separated by a carriage return character or a line feed character, and the lines record, for each bit rate, the URI of a segment and its corresponding attribute information, where the attribute information includes:
BANDWIDTH: bandwidth, mandatory parameter that indicates bandwidth required for segment transmission at the bit rate;
PROGRAM-ID: program identifier, a decimal integer that uniquely identifies a particular presentation within the scope of the playlist file;
CODECS: decoder information, optional parameter;
RESOLUTION: resolution, indicating resolution required for playing a segment at the bit rate on a terminal;
AUDIO: audio information, which is required to match a value of the “GROUP-ID” attribute in an “EXT-X-MEDIA” tag of an AUDIO type, and indicates audio information required for playing a segment at the bit rate on a terminal; and
VIDEO: video information, which is required to match a value of the “GROUP-ID” attribute in an “EXT-X-MEDIA” tag of a VIDEO type, and indicates video information required for playing a segment at the bit rate on a terminal.
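To make the attribute list above concrete, the following minimal sketch parses such attributes from a playlist. It assumes the usual HLS layout in which an "#EXT-X-STREAM-INF" tag line carries the attributes and the next line carries the URI; the function name, the sample playlist, and its URIs are made up for illustration.

```python
import re

# Matches KEY=value pairs where the value is a quoted string or a bare token.
_ATTR_RE = re.compile(r'([A-Z0-9-]+)=("[^"]*"|[^,]*)')


def parse_variant_playlist(text):
    """Return a list of (attributes, uri) pairs from an extended M3U playlist."""
    variants, pending = [], None
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("#EXT-X-STREAM-INF:"):
            attr_text = line.split(":", 1)[1]
            pending = {k: v.strip('"') for k, v in _ATTR_RE.findall(attr_text)}
        elif line and not line.startswith("#") and pending is not None:
            variants.append((pending, line))  # the URI follows its tag line
            pending = None
    return variants


# Hypothetical sample playlist with two bit rates.
SAMPLE = """#EXTM3U
#EXT-X-STREAM-INF:PROGRAM-ID=1,BANDWIDTH=128000,RESOLUTION=320x180
low/index.m3u8
#EXT-X-STREAM-INF:PROGRAM-ID=1,BANDWIDTH=512000,CODECS="avc1.42e00a,mp4a.40.2"
high/index.m3u8
"""

for attrs, uri in parse_variant_playlist(SAMPLE):
    print(attrs["BANDWIDTH"], uri)
```

Quoted values are matched as a unit so that a comma inside a CODECS string is not mistaken for an attribute separator.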
After downloading the playlist file from the streaming media server, the terminal downloads and plays segments of the streaming media file in ascending order of bit rates, according to the URIs (Uniform Resource Identifiers) of the segments at different bit rates that are recorded in the playlist file, until the terminal stabilizes at a suitable bit rate. The terminal then continuously downloads and plays the streaming media file at the suitable bit rate.
In the foregoing streaming media data acquiring solution, the HLS specifications suggest that the duration of each segment be approximately 10 seconds. When a streaming media file is downloaded in ascending order of bit rates, the streaming media data downloaded in the first tens of seconds is at a low bit rate. Streaming media data at a higher bit rate has richer image details, that is, better image quality; therefore, the streaming media data downloaded and played in the first tens of seconds according to the foregoing solution has poor quality.
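The "first tens of seconds" figure can be checked with simple arithmetic. The sketch below assumes, as a simplification, that exactly one segment is played at each bit rate below the one at which the terminal finally stabilizes; the function name and the example bit rates are illustrative only.

```python
def seconds_below_target(bitrates_kbps, target_kbps, segment_seconds=10.0):
    """Playback time spent below target_kbps when one segment is played per
    bit rate step in ascending order (a simplifying assumption)."""
    lower_steps = [b for b in sorted(bitrates_kbps) if b < target_kbps]
    return len(lower_steps) * segment_seconds


# Example: three bit rates, terminal finally stabilizes at 512 kbps.
print(seconds_below_target([128, 256, 512], target_kbps=512))  # prints 20.0
```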