When traditional streaming media technology is adopted, the transmitted streaming media data has to traverse a firewall, and a dedicated media server is required to support the technology. In addition, traditional streaming media technology is relatively complicated to implement. Internet streaming media technology, which transmits streaming media data over the Internet, has since emerged. It imposes no additional requirements on the existing Internet infrastructure; instead, it changes how media files are stored and described so that streaming media data can be transmitted over the existing HTTP protocol.
The Dynamic Adaptive Streaming over HTTP (DASH) standard formulated by the Moving Picture Experts Group (MPEG), referred to as the MPEG DASH standard for short, provides a standardized scheme for transmitting streaming media data using the Internet streaming media technology.
The hierarchical structure model of the media presentation description (MPD) defined by MPEG DASH is shown in FIG. 1.
In the hierarchical structure model, a period describes media content that can be played for a span of time, and the media content described by consecutive periods is continuous in time. One period includes a plurality of adaptation sets, each adaptation set describes media content available at a plurality of code rates, and each code rate corresponds to one representation. A representation describes information about the media content, such as its encapsulation format, code rate, and encoding/decoding parameters. One representation includes uniform resource locators (URLs) of a plurality of segments, where each URL indicates the storage location of the corresponding segment. A segment contains the actual media content, for example audio, video, captions, or multiplexed audio and video.
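The hierarchy described above can be sketched as a small set of nested data types. This is only an illustrative model, not the MPD schema itself: the class and field names, the example URLs, and the `pick_representation` helper are assumptions introduced here to show how one adaptation set groups several representations of the same content at different code rates.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Representation:
    # One encoded version of the content at a single code rate (bit/s).
    rep_id: str
    bandwidth: int
    segment_urls: List[str] = field(default_factory=list)


@dataclass
class AdaptationSet:
    # The same content offered at several code rates, one Representation each.
    representations: List[Representation] = field(default_factory=list)


@dataclass
class Period:
    # Media content playable for a span of time, starting at `start` seconds.
    start: float
    adaptation_sets: List[AdaptationSet] = field(default_factory=list)


def pick_representation(aset: AdaptationSet, available_bps: int) -> Representation:
    """Hypothetical helper: highest code rate that fits, else the lowest one."""
    fitting = [r for r in aset.representations if r.bandwidth <= available_bps]
    if fitting:
        return max(fitting, key=lambda r: r.bandwidth)
    return min(aset.representations, key=lambda r: r.bandwidth)


# Example URLs are placeholders, not real resources.
video = AdaptationSet([
    Representation("low", 500_000, ["http://example.com/low/seg1.m4s"]),
    Representation("high", 2_000_000, ["http://example.com/high/seg1.m4s"]),
])
period = Period(start=0.0, adaptation_sets=[video])
chosen = pick_representation(video, available_bps=1_000_000)
print(chosen.rep_id, chosen.segment_urls[0])
```

With 1 Mbit/s available, the sketch selects the 500 kbit/s representation, since the 2 Mbit/s one does not fit.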
An optional solution for transmitting streaming media data over the Internet based on the MPEG DASH standard discussed above is to first establish a two-way WebSocket connection, and then to transmit control information of the MPEG DASH standard in WebSocket text frames and segments in WebSocket binary frames.
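The split between text frames and binary frames can be illustrated with a small receive-side dispatcher. The sketch assumes that a text frame payload arrives as a Python `str` and a binary frame payload as `bytes`, and that control information is encoded as JSON with a `type` field; the encoding and the `handle_frame` function are assumptions made for illustration only.

```python
import json
from typing import Union


def handle_frame(payload: Union[str, bytes]) -> str:
    """Classify one received WebSocket frame payload.

    Assumption: text frames are delivered as str, binary frames as bytes.
    """
    if isinstance(payload, str):
        # Text frame: control information (hypothetical JSON encoding).
        message = json.loads(payload)
        return "control:" + message["type"]
    # Binary frame: the raw bytes of one media segment.
    return "segment:%d bytes" % len(payload)


print(handle_frame('{"type": "StartStream"}'))
print(handle_frame(b"\x00\x01\x02"))
```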
A process by which a client presents media content under the framework shown in FIG. 1 is shown in FIG. 2. The process comprises the following steps.
In step S201, the client transmits an OpenMedia message to the server. By carrying the URL of the MPD in this message, the client indicates to the server the media to be played.
In step S202, the server transmits a MediaInfo message to the client to inform the client of information about the media (for example, that the media is ready).
In step S203, the client transmits a StartStream message to the server, to request the server to distribute media.
In step S204, the server transmits a StreamInfo message to the client to inform the client of information about the media stream.
In step S205, the server starts to distribute media segments.
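Steps S201 to S205 can be walked through with an in-memory stand-in for the server. The message names come from the process above; everything else (the dictionary encoding, the `mpd` and `status` fields, the `ToyServer` class, and the `play` function) is hypothetical and only serves to make the ordering of the exchange concrete.

```python
from typing import Dict, List, Union

Message = Dict[str, str]


class ToyServer:
    """In-memory stand-in for the server side of steps S201 to S205."""

    def __init__(self, segments: List[bytes]) -> None:
        self.segments = segments
        self.opened = False

    def handle(self, message: Message) -> List[Union[Message, bytes]]:
        if message["type"] == "OpenMedia":        # S201 received
            self.opened = True
            return [{"type": "MediaInfo", "status": "ready"}]  # S202
        if message["type"] == "StartStream":      # S203 received
            if not self.opened:
                raise RuntimeError("StartStream before OpenMedia")
            # S204 followed by S205: StreamInfo, then the media segments.
            return [{"type": "StreamInfo"}, *self.segments]
        raise ValueError("unexpected message: " + message["type"])


def play(server: ToyServer, mpd_url: str) -> List[bytes]:
    """Client side: perform S201 and S203, return the segments from S205."""
    replies = server.handle({"type": "OpenMedia", "mpd": mpd_url})
    if replies[0]["type"] != "MediaInfo":
        raise RuntimeError("expected MediaInfo")
    replies = server.handle({"type": "StartStream"})
    if replies[0]["type"] != "StreamInfo":
        raise RuntimeError("expected StreamInfo")
    return [r for r in replies[1:] if isinstance(r, bytes)]


# The MPD URL is a placeholder.
received = play(ToyServer([b"seg-1", b"seg-2"]), "http://example.com/media.mpd")
print(len(received))
```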
After the client requests the server to start playing the media, the server can carry indication information, multipleRepresentation, in the StreamInfo message transmitted to the client in step S204 of FIG. 2. This indication information indicates whether the server will distribute a plurality of representations with the same content and different code rates. If it indicates that the server will do so, then, when transmitting segments, the server transmits to the client a plurality of segments with the same content and different code rates (i.e., the segments with the same content and different code rates included in the plurality of representations).
The OpenMedia, MediaInfo, StartStream, and StreamInfo messages in the above process can be regarded as extensions of the MPEG DASH standard carried over the WebSocket protocol, and they are transmitted in WebSocket text frames.
However, the client does not know whether the segment transmitted first by the server is a segment with a high code rate or a segment with a low code rate.
In a known implementation, after receiving one segment, the client determines whether a segment with a higher code rate is received within a preset waiting period. If such a segment is received, the segment with the higher code rate is decoded and presented; otherwise, the previously received segment is decoded and presented.
When the server transmits the segment with the high code rate first, the client in the above implementation still waits for the preset waiting period after receiving that segment, because it does not know whether the segment with the same content and a different code rate that the server transmits next has a higher or a lower code rate. This increases the decoding delay at the client.
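The known implementation and the delay it introduces can be sketched as follows. The function name, the arrival times, and the code rates are illustrative assumptions, and the sketch further assumes that decoding starts as soon as a higher-rate version arrives within the waiting period.

```python
from typing import List, Tuple

Arrival = Tuple[float, int]  # (arrival time in seconds, code rate in bit/s)


def known_implementation(arrivals: List[Arrival], wait: float) -> Tuple[int, float]:
    """Return (code rate decoded, time at which decoding starts).

    `arrivals` lists the versions of one segment in arrival order. After the
    first version arrives, the client waits up to `wait` seconds for a
    version with a higher code rate.
    """
    first_time, first_rate = arrivals[0]
    deadline = first_time + wait
    for time, rate in arrivals[1:]:
        if time <= deadline and rate > first_rate:
            return rate, time       # a higher rate arrived within the wait
    return first_rate, deadline     # nothing better: the full wait has elapsed


low_first = [(0.0, 500_000), (0.2, 2_000_000)]
high_first = [(0.0, 2_000_000), (0.2, 500_000)]
print(known_implementation(low_first, wait=0.5))   # (2000000, 0.2)
print(known_implementation(high_first, wait=0.5))  # (2000000, 0.5)
```

In both orderings the 2 Mbit/s version is decoded, but when it arrives first, decoding starts only at 0.5 s, although the segment was already available at 0.0 s. The difference is the avoidable delay described above.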