In live streaming, the time taken to display the first image on the screen may directly affect user experience. On-demand streaming provides a useful contrast: an on-demand streaming user always starts playback from the 0-second position. In any on-demand streaming media file, the first audio-video frame at 0 seconds must be a key frame. Thus, for on-demand services, setting aside network transmission and decoding factors, the first image may be treated as being displayed almost immediately.
However, for live streaming, a live media file is often a continuous streaming media file, and a live streaming user may join the stream at an arbitrary time. Thus, the first audio-video frame received when the user joins may be an intra-coded frame (I-frame), also known as a key frame, a predictive-coded frame (P-frame), or a bidirectionally predictive-coded frame (B-frame). A player can begin decoding correctly only from a key frame, because P-frames and B-frames depend on other frames. Thus, the following problems may exist.
1) When the player starts to play, a dark screen is very likely to be displayed. Moreover, when the group of pictures (GOP) of the live streaming media file is large, the gap between the first audio-video frame that the player receives and the next key frame may be relatively long, and hence the user may have to wait a long time before seeing the first image.
2) If the server instead transmits data starting from the last key frame, extra delay may be introduced into the live stream, because playback begins from a frame older than the current live position. Thus, user experience may be degraded.
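The two problems above are two sides of one trade-off, which the following minimal sketch may help illustrate. It is not part of the described system. The names `JoinCost` and `join_cost` are hypothetical, and the sketch assumes a fixed GOP length with exactly one key frame at the start of each GOP, a constant frame rate, and no network or decoding time.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class JoinCost:
    # Problem 1: seconds with no picture if the player waits for the next key frame.
    wait_for_next_key_frame_s: float
    # Problem 2: seconds behind live if the server starts from the last key frame.
    lag_from_last_key_frame_s: float


def join_cost(gop_frames: int, fps: float, join_offset: int) -> JoinCost:
    """Estimate both costs for a user joining `join_offset` frames after a key frame.

    Assumptions (illustrative only): fixed GOP length, one key frame per GOP,
    constant frame rate, zero network and decode latency.
    """
    if gop_frames <= 0 or fps <= 0:
        raise ValueError("gop_frames and fps must be positive")
    # Position of the join point inside the current GOP.
    offset = join_offset % gop_frames
    # Joining exactly on a key frame costs nothing in either direction.
    frames_until_next_key = (gop_frames - offset) % gop_frames
    return JoinCost(
        wait_for_next_key_frame_s=frames_until_next_key / fps,
        lag_from_last_key_frame_s=offset / fps,
    )


if __name__ == "__main__":
    # A 250-frame GOP at 25 fps (10 s), joined 100 frames after the key frame.
    print(join_cost(gop_frames=250, fps=25.0, join_offset=100))
```

Under these assumptions the two costs always sum to either zero or one full GOP duration, so a larger GOP may worsen whichever problem applies.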