Presentation of textual information can be an important part of the video viewing experience. Text may be used to present the movie title, chapter names, and track-specific data, as well as subtitles. Subtitles may be used to convey the dialogue of a video presentation in different languages, to aid viewers with hearing impairments or poor listening comprehension, to suit the viewer's current listening preferences, and at times to present director or even user commentary in environments where such information is available.
Typically, embedding textual information such as subtitles with audio and video data into multimedia files involves run-length encoding bitmap images of the subtitle text. Run-length encoding provides an efficient way of storing the bitmaps, and since a bitmap is a pictorial rather than a textual representation of the text, rendering the subtitles imposes no additional requirements, such as embedded or resident font files. Despite these advantages, text stored as bitmaps adapts poorly to changes in image frame size and cannot be scaled to multiple sizes with acceptable visual results. In addition to the scalability problem, bitmap representations cannot easily be searched as text, and searchability is an attractive feature for categorization, metadata, and archival activities.
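The run-length encoding described above can be illustrated with a minimal sketch. It assumes a single row of a 1-bit subtitle bitmap held as a list of pixel values; the function names and the (value, run length) pair layout are illustrative assumptions and do not correspond to any particular subtitle format.

```python
from itertools import groupby

def rle_encode(row):
    """Collapse a row of pixel values into (value, run_length) pairs."""
    return [(value, len(list(run))) for value, run in groupby(row)]

def rle_decode(pairs):
    """Expand (value, run_length) pairs back into the original pixel row."""
    row = []
    for value, count in pairs:
        row.extend([value] * count)
    return row

# Hypothetical 16-pixel row: 0 = transparent background, 1 = glyph pixel.
row = [0, 0, 0, 0, 1, 1, 1, 0, 0, 1, 1, 1, 1, 0, 0, 0]
encoded = rle_encode(row)
print(encoded)
```

Long runs of identical pixels, which are common in subtitle bitmaps, collapse into a single pair each. The encoded form still describes pixels rather than characters, so it can be neither rescaled cleanly nor searched as text.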
Representing the textual information, e.g., a title, chapter names, and/or dialogue in a movie, as actual text in the known alphabet of a particular language is one alternative to using bitmaps. Using text in a movie typically requires encoding the text in a commonly accepted representation. ASCII and Unicode are two such representations: ASCII is a 7-bit code that defines 128 symbols and is typically used for English text, with 8-bit extensions that allow a maximum of 256 symbols commonly used for other European languages, whereas Unicode represents over 100,000 characters and other symbols from a very comprehensive list of world languages.
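As a rough illustration of the difference between the two representations, the sketch below encodes a subtitle string as ASCII when every character permits it and otherwise falls back to UTF-8, which is one common byte encoding of Unicode. The function name and the fallback policy are assumptions made for this example only.

```python
def encode_subtitle(text):
    """Return (encoding_name, encoded_bytes), preferring ASCII when possible."""
    try:
        return "ascii", text.encode("ascii")
    except UnicodeEncodeError:
        # Characters outside the ASCII range need a Unicode encoding.
        return "utf-8", text.encode("utf-8")

english_name, english_bytes = encode_subtitle("Chapter 1")
japanese_name, japanese_bytes = encode_subtitle("第1章")

print(english_name, len(english_bytes))
print(japanese_name, len(japanese_bytes))

# Unlike a bitmap, the stored text can be decoded and searched directly.
print("Chapter" in english_bytes.decode(english_name))
```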
Font files are electronic data files containing a set of representations for displaying characters or symbols. There are cases where the font for playing back subtitles may already be present on a playback device. However, often for foreign languages, or when the look-and-feel of the font is important for artistic, aesthetic, or readability reasons, a specific font that has been tailored to the multimedia presentation is provided with the multimedia content for use by the playback device. The representations, or glyphs, in a font file may take the form of individual bitmaps, drawing instructions and mathematical formulas that specify the outline of a character, or instructions for drawing a series of lines with specific sizes and shapes. The advantage of specifying the glyph shape using drawing instructions and mathematical formulas is that the character representation may be scaled to different sizes while generally maintaining the intended shape of the character. A font may also be composed of a mixture of bitmap and non-bitmap glyphs, where bitmap specifications may be used to depict only certain character sizes.
Some languages, such as Chinese, Japanese, and Korean, utilize very large sets of unique symbols, in many cases with a distinct symbol representing an individual word or syllable. The large number of symbols required by these languages leads to very large font files as compared to languages that combine the letters of a small alphabet to form words. Font files for these languages may be in the 1 to 10 Mbyte range, and can sometimes be as large as 30 Mbytes. While the size of these fonts may not be an issue for processing on a personal computer, in an embedded or consumer electronics device such large font files may pose a problem, especially if the fonts are expected to be dynamically available in memory for the device's rendering engine. When the size of the required font file exceeds the resource handling capability of an embedded device, the behavior exhibited by the device and the resulting user experience may be non-uniform.
The term streaming media describes the playback of media on a playback device, where the media is stored on a server and continuously sent to the playback device over a network during playback. Typically, the playback device keeps enough media in a buffer at any given time that it does not exhaust the buffered media, and thereby interrupt playback, before the next portion of media is received. Adaptive bit rate streaming, or adaptive streaming, involves detecting the present streaming conditions (e.g. the user's network bandwidth and CPU capacity) in real time and adjusting the quality of the streamed media accordingly. Typically, the source media is encoded at multiple bit rates and the playback device or client switches between streaming the different encodings depending on available resources.
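One simple way a client might switch between encodings is sketched below. It assumes the client already holds a bandwidth measurement and a list of the bit rates at which the source was encoded; the function name, the safety factor of 0.8, and the bit rate ladder are hypothetical values chosen for illustration, not part of any specific streaming system.

```python
def select_bitrate(available_bitrates, measured_bandwidth, safety_factor=0.8):
    """Pick the highest encoded bit rate that fits within the bandwidth budget.

    Falls back to the lowest available bit rate when none fits, so that
    playback can continue at reduced quality.
    """
    budget = measured_bandwidth * safety_factor
    candidates = [rate for rate in available_bitrates if rate <= budget]
    return max(candidates) if candidates else min(available_bitrates)

# Hypothetical encodings of the same source, in bits per second.
ladder = [400_000, 800_000, 1_500_000, 3_000_000]

print(select_bitrate(ladder, 2_000_000))   # moderate connection
print(select_bitrate(ladder, 300_000))     # constrained connection
print(select_bitrate(ladder, 10_000_000))  # fast connection
```

Leaving headroom below the measured bandwidth is one possible design choice for reducing the chance that the buffer empties when conditions fluctuate. A real client would typically also take buffer occupancy and CPU capacity into account.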
Adaptive streaming solutions typically utilize either Hypertext Transfer Protocol (HTTP), published by the Internet Engineering Task Force and the World Wide Web Consortium as RFC 2616, or Real Time Streaming Protocol (RTSP), published by the Internet Engineering Task Force as RFC 2326, to stream media between a server and a playback device. HTTP is a stateless protocol that enables a playback device to request a byte range within a file. HTTP is described as stateless, because the server is not required to record information concerning the state of the playback device requesting information or the byte ranges requested by the playback device in order to respond to requests received from the playback device. RTSP is a network control protocol used to control streaming media servers. Playback devices issue control commands, such as “play” and “pause”, to the server streaming the media to control the playback of media files. When RTSP is utilized, the media server records the state of each client device and determines the media to stream based upon the instructions received from the client devices and the client's state.
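The byte range request mentioned above can be sketched with Python's standard library. The example only constructs the request and does not send it; the URL and the helper name are hypothetical. Each request carries the full range it needs, which is why the server does not have to remember earlier requests from the same playback device.

```python
import urllib.request

def build_range_request(url, first_byte, last_byte):
    """Build an HTTP GET request for an inclusive byte range of a file."""
    if first_byte < 0 or last_byte < first_byte:
        raise ValueError("invalid byte range")
    headers = {"Range": f"bytes={first_byte}-{last_byte}"}
    return urllib.request.Request(url, headers=headers)

# Hypothetical media file location, used for illustration only.
request = build_range_request("http://example.com/media/video.mkv", 0, 1023)
print(request.get_method(), request.full_url)
print(request.get_header("Range"))
```

A server that supports range requests answers such a request with status 206 (Partial Content) and only the requested bytes.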
In adaptive streaming systems, the source media is typically stored on a media server as a top level index file pointing to a number of alternate streams that contain the actual video and audio data. Each stream is typically stored in one or more container files. Different adaptive streaming solutions typically utilize different index and media containers. The Synchronized Multimedia Integration Language (SMIL) developed by the World Wide Web Consortium is utilized to create indexes in several adaptive streaming solutions including IIS Smooth Streaming developed by Microsoft Corporation of Redmond, Wash., and Flash Dynamic Streaming developed by Adobe Systems Incorporated of San Jose, Calif. HTTP Adaptive Bitrate Streaming developed by Apple Computer Incorporated of Cupertino, Calif. implements index files using an extended M3U playlist file (.M3U8), which is a text file containing a list of URIs that typically identify a media container file. The most commonly used media container formats are the MP4 container format specified in MPEG-4 Part 14 (i.e. ISO/IEC 14496-14) and the MPEG transport stream (TS) container specified in MPEG-2 Part 1 (i.e. ISO/IEC Standard 13818-1). The MP4 container format is utilized in IIS Smooth Streaming and Flash Dynamic Streaming. The TS container is used in HTTP Adaptive Bitrate Streaming.
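To make the playlist-based index concrete, the following sketch extracts the URIs from an extended M3U playlist by skipping blank lines and lines that begin with `#`, which carry tags or comments. The sample playlist and the segment file names are invented for this example, and the parser deliberately ignores the tag contents.

```python
def parse_playlist_uris(playlist_text):
    """Return the URI lines of an extended M3U playlist, in order."""
    uris = []
    for line in playlist_text.splitlines():
        line = line.strip()
        # Lines starting with '#' are tags or comments, not URIs.
        if not line or line.startswith("#"):
            continue
        uris.append(line)
    return uris

# Hypothetical playlist listing three transport stream segments.
SAMPLE_PLAYLIST = """#EXTM3U
#EXT-X-TARGETDURATION:10
#EXTINF:10,
segment0.ts
#EXTINF:10,
segment1.ts
#EXTINF:10,
segment2.ts
#EXT-X-ENDLIST
"""

uris = parse_playlist_uris(SAMPLE_PLAYLIST)
print(uris)
```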
The Matroska container is a media container developed as an open standard project by the Matroska non-profit organization of Aussonne, France. The Matroska container is based upon Extensible Binary Meta Language (EBML), which is a binary derivative of the Extensible Markup Language (XML). Decoding of the Matroska container is supported by many consumer electronics (CE) devices. The DivX Plus file format developed by DivX, LLC of San Diego, Calif. utilizes an extension of the Matroska container format (i.e. is based upon the Matroska container format, but includes elements that are not specified within the Matroska format).