A media content provider or distributor may deliver media content to various client devices such as televisions, notebook computers, and mobile handsets. The media content provider may support a plurality of media encoders and/or decoders (codecs), media players, video frame rates, spatial resolutions, bit-rates, video formats, or combinations thereof. A media content may be converted from a source or original representation to various other representations to suit the different user devices.
A media content may comprise a media presentation description (MPD) and a plurality of segments. The MPD may be an extensible markup language (XML) file describing the media content, such as its various representations, uniform resource locator (URL) addresses, and other characteristics. For example, the media content may comprise several media components (e.g., audio, video, and text), each of which may have different characteristics that are specified in the MPD. Each media component comprises a plurality of segments containing parts of the actual media content, and the segments may be stored collectively in a single file or individually in multiple files. Each segment may span a pre-defined byte size (e.g., 1,000 bytes) or a pre-defined interval of playback time (e.g., 2 or 5 seconds) of the media content.
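As a minimal sketch of how a client might read such a description, the following Python example parses a heavily simplified MPD and lists the representations of each media component. The XML document, file names, and bandwidth values are made up for illustration, and the helper name list_representations is hypothetical; an actual MPD typically carries many more elements and attributes.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified MPD. The file names and bandwidth
# values are invented for illustration only.
MPD_XML = """<MPD xmlns="urn:mpeg:dash:schema:mpd:2011">
  <Period id="p0">
    <AdaptationSet mimeType="video/mp4">
      <Representation id="v-low" bandwidth="500000">
        <BaseURL>video_low.mp4</BaseURL>
      </Representation>
      <Representation id="v-high" bandwidth="3000000">
        <BaseURL>video_high.mp4</BaseURL>
      </Representation>
    </AdaptationSet>
    <AdaptationSet mimeType="audio/mp4">
      <Representation id="a-main" bandwidth="128000">
        <BaseURL>audio.mp4</BaseURL>
      </Representation>
    </AdaptationSet>
  </Period>
</MPD>
"""

# Namespace prefix mapping used in the element paths below.
NS = {"dash": "urn:mpeg:dash:schema:mpd:2011"}


def list_representations(mpd_xml):
    """Return (mime type, id, bandwidth in bit/s, URL) for each representation."""
    root = ET.fromstring(mpd_xml)
    result = []
    for aset in root.findall("dash:Period/dash:AdaptationSet", NS):
        mime = aset.get("mimeType")
        for rep in aset.findall("dash:Representation", NS):
            url = rep.findtext("dash:BaseURL", default="", namespaces=NS)
            result.append((mime, rep.get("id"), int(rep.get("bandwidth")), url))
    return result


if __name__ == "__main__":
    for entry in list_representations(MPD_XML):
        print(entry)
```

In this sketch each media component is described once, and its alternative encodings differ only in identifier, bandwidth, and URL.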
Depending on the application, the media content may be divided into various hierarchies. For example, the media content may comprise multiple periods, where a period is a time interval relatively longer than a segment. For instance, a television program may be divided into several 5-minute-long program periods, which are separated by several 2-minute-long commercial periods. Further, a period may comprise one or multiple adaptation sets (AS). An AS may provide information about one or multiple media components and the various encoded representations thereof. For instance, an AS may contain different bit-rates of a video component of the media content, while another AS may contain different bit-rates of an audio component of the same media content. A representation may be an encoded alternative of a media component, varying from other representations by bit-rate, resolution, number of channels, or other characteristics, or combinations thereof. Each representation comprises multiple segments, which are media content chunks in a temporal sequence. Moreover, to enable downloading a segment in multiple parts, sub-segments may sometimes be used, each having a specific duration and/or byte size. One skilled in the art will understand the various hierarchies that can be used to deliver a media content.
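The hierarchy described above (period, adaptation set, representation, segment) may be pictured with the following Python sketch. All class names, field names, bandwidth values, and the helper make_segments are assumptions chosen for illustration and are not taken from any standard or product; sub-segments are omitted for brevity.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Segment:
    url: str            # hypothetical location of the media chunk
    duration_s: float   # playback time covered by the chunk


@dataclass
class Representation:
    rep_id: str
    bandwidth_bps: int  # bit-rate of this encoded alternative
    segments: List[Segment] = field(default_factory=list)


@dataclass
class AdaptationSet:
    component: str      # e.g., "video" or "audio"
    representations: List[Representation] = field(default_factory=list)


@dataclass
class Period:
    label: str
    adaptation_sets: List[AdaptationSet] = field(default_factory=list)


def make_segments(prefix, total_s, segment_s):
    """Split total_s seconds into fixed-length segments; the last may be shorter."""
    segments = []
    start = 0.0
    index = 0
    while start < total_s:
        length = min(segment_s, total_s - start)
        segments.append(Segment(f"{prefix}_{index}.m4s", length))
        start += length
        index += 1
    return segments


# A 5-minute program period with 2-second segments in every representation.
program = Period("program", [
    AdaptationSet("video", [
        Representation("v-low", 500_000, make_segments("v-low", 300.0, 2.0)),
        Representation("v-high", 3_000_000, make_segments("v-high", 300.0, 2.0)),
    ]),
    AdaptationSet("audio", [
        Representation("a-main", 128_000, make_segments("a-main", 300.0, 2.0)),
    ]),
])
```

Under these assumptions, each representation of the 5-minute period holds 150 segments of 2 seconds each, and every representation within the period covers the same playback interval.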
In adaptive streaming, when a media content is delivered to a client or user device, the user device may select appropriate segments dynamically based on a variety of factors, such as network conditions, device capability, and user choice. Adaptive streaming may include various technologies or standards implemented or being developed, such as Dynamic Adaptive Streaming over Hypertext Transfer Protocol (HTTP) (DASH), HTTP Live Streaming (HLS), or Internet Information Services (IIS) Smooth Streaming. For example, the user device may select a segment with the highest quality (e.g., resolution or bit-rate) possible that can be downloaded in time for playback without causing stalling or rebuffering events in the playback. Thus, the user device may seamlessly adapt its media content playback to changing network conditions.
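One simple way a user device might choose among representations is sketched below in Python: it picks the highest bit-rate that fits within a fraction of the measured throughput, and falls back to the lowest bit-rate otherwise. The function name select_bandwidth, the safety factor of 0.8, and the example bit-rates are assumptions for illustration; actual clients may also weigh buffer level, device capability, and user choice.

```python
def select_bandwidth(available_bps, throughput_bps, safety=0.8):
    """Pick the highest bit-rate not exceeding a fraction of the throughput.

    available_bps:  bit-rates of the available representations (bit/s)
    throughput_bps: recently measured network throughput (bit/s)
    safety:         assumed headroom factor to reduce the risk of stalling
    """
    budget = throughput_bps * safety
    candidates = [b for b in available_bps if b <= budget]
    # If nothing fits, fall back to the lowest bit-rate so playback can continue.
    return max(candidates) if candidates else min(available_bps)


if __name__ == "__main__":
    ladder = [500_000, 1_500_000, 3_000_000]
    for measured in (10_000_000, 2_000_000, 400_000):
        print(measured, "->", select_bandwidth(ladder, measured))
```

Re-evaluating this choice before each segment request lets the selection follow the network: with the example values, a drop in measured throughput from 10 Mbit/s to 2 Mbit/s moves the selection from the 3 Mbit/s representation to the 1.5 Mbit/s one.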
Another way of delivering content involves downloading, which may be used to partially meet personalized requirements of subscribers. DASH may allow a subscriber to have a better experience. For example, the subscriber's device can retrieve media content with high quality when the network connectivity is fast, and switch to media content with low quality to continue playing content when the network connectivity worsens. In existing schemes, segments of a media content may be discarded after they are retrieved and decoded by a client device. Accordingly, the segments may not be effectively used for future purposes. For example, if the client device decides to replay the media content, streaming may need to restart from the beginning, which may waste network resources.