Many people have access to multimedia but cannot fully appreciate the content it delivers because of an impairment to one of their senses, such as sight or hearing. For example, a hearing-impaired person may have access to a multimedia content stream having both video and audio aspects, yet can perceive only the video aspect of the content. Likewise, a visually impaired person may have access to a multimedia content stream having both video and audio aspects, yet can perceive only the audio aspect of the content.
Captioning is a supplemental media stream that assists those who are hearing impaired, and it has been an industry staple for television programs for many years. The notion of captioning has expanded beyond television broadcasts to other platforms, in particular to computer-delivered multimedia streams. Companies such as Apple and RealNetworks have products that enable content authors to include supplemental captioning streams with end products using proprietary captioning formats. These formats are generally based on the W3C's Synchronized Multimedia Integration Language (SMIL).
One key drawback of current systems for integrating supplemental media streams with main multimedia streams is that the information needed to adequately supplement a main media stream often cannot keep pace with the main media stream. Due to the nature of supplemental media streams, presenting all of the information necessary to adequately supplement the main media stream requires more time than the main media stream allows. As a consequence, content authors must selectively omit some information that would otherwise be included in the supplemental stream in order to keep pace with the main media stream.
As an example of the foregoing problem, captioning typically displays, in textual format, the dialog of the characters on the screen. Sounds of events that occur off-screen that influence the speech of those characters, such as a scream or sounds of a collision in the background, might need to be omitted in order to capture all of the dialog of the present characters. Ideally, there would be a description of the sound that so influences the speech of those characters. But because the dialog of the characters continues, it is often necessary to omit such descriptions. In addition, supplemental media from one scene often “spills over” into the following scene, creating some confusion as to what is currently occurring in the main media stream.
What is needed is a system for supplying supplemental media streams along with a main media stream and selectively pausing the main media stream to permit the supplemental media stream to fully present the content associated with a particular event. Additionally, the system should suspend the timing model of the main media stream, upon which the supplemental media streams are based, so that the timing of the supplemental media streams is not affected by pauses in the main media stream.
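The behavior described above can be illustrated with a minimal sketch. It is not drawn from any existing product or format. The names `Cue`, `pause_main`, and `schedule` are hypothetical, and the sketch assumes each supplemental cue is anchored to a position on the main stream's timeline and may request that the main stream pause while the cue is presented. Because the main timeline is suspended during a pause, later cues keep their original media-time positions and only the wall-clock time is extended.

```python
from dataclasses import dataclass


@dataclass
class Cue:
    """A hypothetical supplemental-stream event (for example, a caption)."""
    start: float              # position on the main stream's timeline, in seconds
    duration: float           # time needed to present the cue fully, in seconds
    text: str
    pause_main: bool = False  # if True, the main stream pauses while the cue plays


def schedule(cues, main_duration):
    """Compute when each cue fires in wall-clock time.

    Returns a list of (wall_time, media_time, text) tuples and the total
    wall-clock running time. The media_time of every cue is unchanged,
    because the main timeline does not advance while the main stream is paused.
    """
    paused_total = 0.0  # accumulated wall-clock time spent with the main stream paused
    events = []
    for cue in sorted(cues, key=lambda c: c.start):
        wall_time = cue.start + paused_total
        events.append((wall_time, cue.start, cue.text))
        if cue.pause_main:
            # The main timeline is suspended for the full length of the cue.
            paused_total += cue.duration
    return events, main_duration + paused_total
```

For instance, with a 10-second main stream, a 3-second pausing cue at media time 2.0 pushes a later cue at media time 5.0 to wall-clock time 8.0, while that cue's position on the main timeline remains 5.0.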