Various techniques have conventionally been used to provide media content, such as content including video and/or audio, to the aurally or visually impaired. In this regard, video content may include pictures, motion pictures, animation, and other content, and audio content may include dialogue, music, sound effects, and other content. Such content may be difficult to provide to aurally or visually impaired users in a form in which they can practically perceive the information it conveys.
For the aurally-impaired, video content is conventionally supplemented with closed captioning in order to provide relatively complete media content to such users. Closed captioning provides on-screen text representing the audio content with varying specificity. For example, closed captioning can represent one or more of dialogue, music, sound effects, and other components of audio content. As shown in FIG. 1, a conventional closed captioning system may include a stenographer 3, who listens to either a live or pre-recorded production 1. While listening, the stenographer 3 creates a closed captioning stream 5 based on the audio content.
Two types of closed captioning are paint-on closed captioning and pop-on closed captioning. Paint-on closed captioning is the more frequently implemented type of closed captioning. In paint-on closed captioning, a stenographer 3 listens to a production and records, in real time, the closed captioning stream 5, e.g., during a live broadcast of a television program or sporting event. As the stenographer 3 records the closed captioning stream 5, the stream is “painted-on” to a user's display device. Because paint-on closed captioning is created in real time, users generally experience a delay between receiving the video content and receiving the closed captioning stream that corresponds to the associated audio content.
In contrast, pop-on closed captioning typically includes the closed captioning stream 5 and an associated command that directs a display device to “pop-on” the closed captioning stream 5 at the appropriate time. Thus, pop-on closed captioning requires some programming prior to distribution to users in order to record and accurately time the “pop-on” commands. As a result of this additional programming, pop-on closed captioning may be synchronously displayed with the corresponding video content.
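The timing behavior of pop-on closed captioning described above can be illustrated with a minimal sketch. It assumes a simplified model in which each caption carries a start time and an end time in seconds; the names `PopOnCue` and `active_caption` and the sample captions are hypothetical and do not represent the CEA-608 or CEA-708 command encoding.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PopOnCue:
    """Hypothetical pre-timed caption: text plus the window in which it is shown."""
    start_s: float  # time at which the caption "pops on"
    end_s: float    # time at which the caption is cleared
    text: str


def active_caption(cues: List[PopOnCue], playback_s: float) -> Optional[str]:
    """Return the caption text to display at playback_s, or None if no cue is active."""
    for cue in cues:
        # A cue is active from its start time up to, but not including, its end time.
        if cue.start_s <= playback_s < cue.end_s:
            return cue.text
    return None


# Illustrative cues prepared before distribution (assumed values).
cues = [
    PopOnCue(0.0, 2.5, "[door creaks]"),
    PopOnCue(2.5, 5.0, "Who's there?"),
]

print(active_caption(cues, 1.0))
print(active_caption(cues, 6.0))
```

Because the cue times are fixed before distribution, the display device only needs to compare the playback position against them, which is why pop-on captions can appear in sync with the video.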
With either type of closed captioning, the closed captioning stream 5 may be distributed 7 to users by closed captioning systems, such as, for example, CEA-608 or CEA-708. When viewing a particular program at a viewing location 8, the program may be displayed together with the on-screen closed captioning 9. In this manner, aurally-impaired users may be able to perceive information provided in an audio portion of media content.
For the visually-impaired, audio content may be supplemented with an additional audio-based description of video content in order to provide relatively complete media content to such users. The additional audio-based description may describe the video content with varying specificity, including, for example, actions, movements, locations, scenery, and expressions. As shown in FIG. 2, a conventional audio-based video description service may include a narrator 4 who records an audio track 6 of video content while either reading a production script 2 or viewing a live or pre-recorded production 1. The audio track 6 may then be distributed 12 to users by, for example, traditional broadcast TV, cable, satellite, telco, or recorded media such as DVD, Blu-ray, or videocassette. When listening to a particular program at a viewing location 8, the program may then be played together with the audio track 6. In this manner, visually-impaired users may be able to perceive information provided in a video portion of media content.
However, conventional audio-based video description services are time-consuming and expensive in their production processes. For example, in order to create such audio-based video descriptions, one or more narrators must view entire productions and record audio tracks of the video content. In addition, production entities are often required to use the expensive services of professional narrators.
Moreover, having narrators record audio tracks based on production scripts or pre-recorded programs adds another step to the production process, increasing time and cost. Nevertheless, this process is often used because it enables narrators to create audio tracks with greater detail and accuracy: it may be possible to start, stop, delete, add, re-record, and otherwise edit all or portions of the audio tracks.
The alternative process, in which narrators record the audio tracks during live production, is also often used. This process adds complexity to the live production process. It also results in audio tracks with less detail and accuracy than tracks recorded from production scripts or pre-recorded programs, since the narrators may have only a single opportunity to record the audio tracks during live production.
Further, the distribution of the audio-based video description service consumes the scarce bandwidth of an additional audio channel. The transmission of audio content requires a large bandwidth of approximately 100 to 400 kilobits per second. Such bandwidth may be unavailable or, if available, may be expensive. In contrast, closed captioning is generally allocated a bandwidth of 9.6 kilobits per second, with an actual data space of approximately 1 kilobit per second. Therefore, wherever the large bandwidth required for transmitting audio content is unavailable, the audio-based video description service would be unavailable to visually-impaired users.
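To make the scale of this difference concrete, the following minimal sketch computes how many times more bandwidth the audio channel needs than closed captioning, using only the approximate figures stated above. The constant and function names are hypothetical and chosen for illustration.

```python
# Approximate figures taken from the text above, in kilobits per second.
AUDIO_KBPS = (100.0, 400.0)     # low and high ends of the audio bandwidth range
CAPTION_ALLOCATED_KBPS = 9.6    # bandwidth generally allocated to closed captioning
CAPTION_DATA_KBPS = 1.0         # approximate actual data space for closed captioning


def ratio_range(audio_range, caption_kbps):
    """Return (low, high) ratios of audio bandwidth to caption bandwidth."""
    low, high = audio_range
    return (low / caption_kbps, high / caption_kbps)


vs_allocated = ratio_range(AUDIO_KBPS, CAPTION_ALLOCATED_KBPS)
vs_data = ratio_range(AUDIO_KBPS, CAPTION_DATA_KBPS)

print(f"Audio vs. allocated caption bandwidth: {vs_allocated[0]:.1f}x to {vs_allocated[1]:.1f}x")
print(f"Audio vs. caption data space: {vs_data[0]:.0f}x to {vs_data[1]:.0f}x")
```

On these figures, an audio channel needs roughly 10 to 42 times the bandwidth allocated to closed captioning, and roughly 100 to 400 times its actual data space.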