Media streaming, such as streaming audio, video, images, text, and the like, is a popular use of the Internet. Generally, media streaming involves sending large amounts of data from a media server to a client device, such as a personal computer, a mobile device, a television, or the like. Each media stream may have many alternate media streams, such as audio alternatives for different languages, textual alternatives for closed captioning, etc. Furthermore, because media files are large and client devices face differing network constraints, media alternatives at different bit rates may also be provided, thereby enabling multiple bit rate switching for adaptive streaming. Such a technique allows the media server to provide, and/or the client device to request, the media fragments of the quality most suitable given the network constraints. For example, a client device connected via a broadband connection may access high quality media streams, while a client device connected via a lower bandwidth connection may access lower quality media streams.
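The selection of a quality level under a network constraint can be illustrated with a minimal sketch. It assumes a client that knows the available bit rates and has a measured bandwidth estimate; the function name `select_bit_rate` and the example bit rates are hypothetical and serve only as illustration, not as the behavior of any particular streaming system.

```python
def select_bit_rate(available_kbps, measured_bandwidth_kbps):
    """Pick the highest available bit rate that fits the measured bandwidth.

    Falls back to the lowest available bit rate when none fits, so that
    playback can still proceed at the lowest quality level.
    """
    candidates = [rate for rate in available_kbps if rate <= measured_bandwidth_kbps]
    return max(candidates) if candidates else min(available_kbps)


# Hypothetical bit rate alternatives (kbps) offered for one media stream.
alternatives = [300, 700, 1500, 3000]

broadband_choice = select_bit_rate(alternatives, 5000)  # fits the top alternative
constrained_choice = select_bit_rate(alternatives, 800)  # limited to a lower one
```

A practical client would typically re-evaluate this choice at media fragment boundaries as its bandwidth estimate changes.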
During adaptive streaming, the media stream is usually provided in chunks, or media fragments. For easy content management on the streaming media server side, one popular solution is to store all media fragments belonging to the same quality level audio/video alternative together as one file. In this solution, a text-based media description file contains a separate description, including a time offset, for each media fragment contained in the single file, thereby allowing the use of standard HTTP servers. Including a separate text-based description for every media fragment of the media stream, however, may create a very large and unmanageable text file for media content of reasonable duration with several video/audio (and/or other media types such as text or graphics) alternatives, which degrades streaming performance, e.g., by lengthening the startup delay.
For example, consider media content with a duration of 90 minutes, 7 video alternatives of different bit rates, 2 audio alternatives of different languages, and a media fragment duration of 2 seconds. There are then a total of (90 minutes × 60 seconds/minute × 7 video alternatives × 2 audio alternatives) / (2 seconds/fragment), or 37,800 media fragments, each of which is individually textually defined in the media description file.
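The count in the example above can be reproduced with a short sketch. It follows the formula as stated in the text, which multiplies the number of video alternatives by the number of audio alternatives; the function name `count_media_fragments` and its parameter names are hypothetical.

```python
import math


def count_media_fragments(duration_minutes, video_alternatives,
                          audio_alternatives, fragment_seconds):
    """Count the fragment descriptions needed in the media description file.

    Mirrors the formula in the text: total duration in seconds, times the
    number of video alternatives, times the number of audio alternatives,
    divided by the fragment duration.
    """
    total_seconds = duration_minutes * 60
    # Round up so a final, shorter fragment is still counted.
    fragments_per_alternative = math.ceil(total_seconds / fragment_seconds)
    return fragments_per_alternative * video_alternatives * audio_alternatives


total = count_media_fragments(
    duration_minutes=90,
    video_alternatives=7,
    audio_alternatives=2,
    fragment_seconds=2,
)
print(total)  # 37800
```

Because the count grows linearly with duration and with each added alternative, the description file grows in the same proportion when every fragment is described individually.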