Consumers have an ever-increasing array of options for consuming media content, in terms of the types of media content (e.g., video, audio, etc.), providers of the media content, and devices for consuming the media content. Media content providers are becoming increasingly sophisticated and effective at providing media content quickly and reliably to consumers.
Many client devices that consume online content employ an adaptive bitrate streaming technique to request successive fragments of the content for decoding, rendering, and display. Manifest data is provided to the client, giving the client the information it needs to generate properly formatted requests for the audio, video, and subtitle fragments of the live streaming content. The manifest data typically includes multiple options for video and audio streams, each including video and audio fragments at different resolutions, quality levels, bitrates, languages, etc. The manifest data also includes presentation time data, such as timestamps, for presenting the fragments according to a media timeline. In some scenarios, the presentation time data may indicate that two fragments for a particular playback option (e.g., two adjacent video fragments for a particular resolution, quality level, and bitrate) overlap in time.
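For illustration, the overlap check a client might perform on presentation time data can be sketched as follows. This is a minimal sketch: the `Fragment` record and its field names are hypothetical, not part of any particular manifest format.

```python
from dataclasses import dataclass


@dataclass
class Fragment:
    start: float     # presentation timestamp from the manifest, in seconds
    duration: float  # fragment duration, in seconds


def overlap_seconds(prev: Fragment, nxt: Fragment) -> float:
    """Return how far the next fragment's start precedes the end of the
    previous fragment on the media timeline (0.0 when there is no overlap)."""
    return max(0.0, (prev.start + prev.duration) - nxt.start)


# A 15.2-second secondary-content fragment ending at t = 15.2,
# followed by primary content resuming at t = 15.0, overlaps by ~0.2 s.
print(overlap_seconds(Fragment(0.0, 15.2), Fragment(15.0, 6.0)))
```

Comparing each fragment's start against the end of its predecessor in this way is how a client can detect the overlapping presentation times described above before deciding how (or whether) it can compensate.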
For example, live streaming content includes primary content such as scheduled content (e.g., premium movie channels) or live broadcasts (e.g., live sporting events, live concerts, etc.). Live streaming content often includes segments of secondary content (e.g., advertisements) that are dynamically inserted within the primary content. The secondary content is typically inserted in the place of so-called slates that are inserted (often manually and in real time) as placeholders in the primary content. For example, a slate might be inserted at the source of the live content (e.g., at a football stadium video capture booth) by an operator pushing a button when the slate should begin and releasing or pressing the button again when the slate should end based on what is happening in real time at the event being broadcast (e.g., during a timeout on the field). Given the arbitrary nature of slate insertion, and that secondary content (e.g., ads) inserted in such slate periods originates from other sources (e.g., an ad exchange), the inserted secondary content may be longer than the slate it replaces.
For instance, presentation time data in manifest data may indicate a temporal overlap of 0.2 seconds corresponding to the scenario where fragments for secondary content with a duration of 15.2 seconds are being inserted for a slate with a duration of 15 seconds. In some devices, the manifest data can be used to play back the streaming content by either truncating playback of the last fragment of secondary content by 0.2 seconds, or by skipping the first 0.2 seconds of the first fragment of the resumed primary content (i.e., offsetting the start of playback of a fragment by skipping an initial portion of the fragment). However, certain devices do not support fragment playback involving truncation or offset starts because of limitations in rendering hardware and/or firmware. If the device is not equipped to handle these overlaps at the transitions between primary and secondary content, playback may be degraded, such as by loss of synchronization between audio and video, or by other undefined behavior including silence, black frames, and/or corrupted frames. The degradation may become amplified over time as the effects of such overlaps accumulate.
Some devices that lack the ability to decode-and-drop (i.e., to drop frames that have been decoded but will not be displayed) can truncate playback of the end of a fragment by flushing the renderer stack. However, in addition to not addressing the inability to perform offset starts (i.e., to skip the beginning portion of a fragment), this is not an optimal solution because it can increase the chance of re-buffers as the renderer stacks are replenished. It also causes the media player to drift away from the live playhead of the primary content because of the additional time it takes to fill the renderer buffer after a flush, which may cause a visible black period and/or spinner indicating a pause in media content playback. If the playback of the media player, i.e., the client playhead, lags too far behind the live playhead, this can result in a negative viewer experience. Another approach avoids the need to handle such discontinuities by using two media players: one to handle playback of the primary content and one to handle playback of the secondary content, switching between the two players as needed. However, running two media players can be wasteful of processing resources and may be characterized by unacceptable latencies when switching between the two players. Furthermore, certain devices only have hardware for one video decoder pipeline; implementations requiring two media players are therefore not an option for such devices.
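A rough model of how flush-based truncation lets the client playhead drift behind the live playhead can be sketched as follows. The per-flush refill delay is an assumed constant for illustration, not a measured value:

```python
def playhead_latency_after(initial_latency: float,
                           refill_delay_per_flush: float,
                           num_flushes: int) -> float:
    """Estimate the client playhead's latency behind the live playhead
    (in seconds) after a number of renderer-stack flushes, assuming each
    flush stalls playback while the renderer buffer is refilled
    (hypothetical model)."""
    return initial_latency + refill_delay_per_flush * num_flushes


# E.g., a client starting 4 s behind live, stalling 0.5 s per flush,
# falls 8 s behind live after eight primary/secondary transitions.
print(playhead_latency_after(4.0, 0.5, 8))
```

Because the drift is additive across transitions, frequent secondary-content insertions compound the latency until the client playhead lags too far behind the live playhead, which is the negative viewer experience noted above.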
In older or lower-cost devices that do not support truncating fragment playback, such as by dropping a portion of the audio or video fragment, the effect of such overlaps can be degraded playback in the form of buffer flushing, loss of audio/video synchronization, and/or excessive increases in the playhead latency (e.g., the delay between the live playhead of the video content and a client playhead associated with the playback of the video content on a client device).