Closed-captioning of television and other media programs is often provided so that people with hearing impairments can understand the dialogue in a program. Live broadcasts, for example, news programs, award shows, and sporting events, are often captioned in real time by transcriptionists watching a video feed of the program and/or listening to an audio feed for the program (such as via a telephone or voice over Internet protocol connection), which may be a period of time (such as 4-6 seconds) ahead of the actual live broadcast. Naturally, there is a delay in the presentation of the closed caption information to a hearing-impaired viewer because of the time it takes the transcriptionist to type the words spoken after hearing them and because the feed utilized by the transcriptionist is typically a short period of time ahead of the actual live broadcast. Presently, when such programs are streamed in real time or recorded, the closed captions remain in the vertical blanking interval of the original frames in an analog signal, or in the same location within the bit stream or data packet of a digital signal. Thus, upon receipt of the streamed live program and/or replay of a recording of the original live program, the closed captioning is still delayed and is not simultaneous with the actual spoken words or sounds in the program.
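The persistence of this delay can be illustrated with a minimal sketch. The timestamps, delay value, and packet structure below are hypothetical assumptions for illustration only, not taken from any actual broadcast system: because each caption is generated several seconds after the corresponding words are spoken, a caption packet carried in-band with the frames retains that same offset whenever the stream or recording is played back.

```python
# Hypothetical model of caption delay surviving into a recording.
# All numbers here are illustrative assumptions, not measured values.

from dataclasses import dataclass

@dataclass
class CaptionPacket:
    text: str
    frame_time: float  # seconds into the program where the packet is carried

# Assumed transcription latency (time to hear and type a word), in seconds.
TYPING_DELAY = 4.0

# (word, time the word is actually spoken in the program)
spoken = [("Good", 10.0), ("evening", 10.4), ("everyone", 10.9)]

# The caption for each word lands in a frame TYPING_DELAY seconds later,
# and that placement is preserved verbatim in the recorded stream.
captions = [CaptionPacket(word, t + TYPING_DELAY) for word, t in spoken]

# On replay, every caption is still offset from its audio by the same amount:
for (word, spoken_time), pkt in zip(spoken, captions):
    print(f"{word!r}: spoken at {spoken_time:.1f}s, "
          f"captioned at {pkt.frame_time:.1f}s "
          f"(delay {pkt.frame_time - spoken_time:.1f}s)")
```

The sketch shows only that the offset is baked into the stored signal; it does not model any mechanism for removing it.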
Fundamentally, the Internet is about text. Internet search engines (e.g., Google®) parse the text of the pages in websites and index it. When an Internet search is performed, it is this index of the text that is analyzed. Local searches on desktop computers (e.g., “Find” commands, Apple® “Spotlight” software, or Microsoft Windows “Search”) are similarly basically text searches for words or phrases in a document, for file names, and for metadata about a file (e.g., author, file creation date, etc.). Digitally recorded video and audio have traditionally been fairly opaque to search engines, whether local or Internet-based. For example, the Google® search engine cannot process recorded video in any meaningful way; only the text that surrounds the video is indexed. This indexing is thus typically limited to the title of the video, a few keywords (assuming the site uses “tagging” of some sort), and possibly the date the recording was made. There is currently no way to conduct a deeper search of the video itself to identify particular content, for example, occurrences of names, places, music, or events.
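The text indexing described above can be sketched as a simple inverted index mapping words to the documents that contain them. This is a minimal illustration with hypothetical page text; production search engines use far more elaborate structures, but the limitation is the same: only the text surrounding a video is available to index, never the video content itself.

```python
from collections import defaultdict

def build_index(docs):
    """Map each word to the set of document ids whose text contains it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index

# Hypothetical pages hosting videos: the indexable text is just the
# title, tags, and date, not anything spoken or shown in the video.
docs = {
    "page1": "awards show highlights video 2009",
    "page2": "evening news sports recap video",
}
index = build_index(docs)

# A query matches only this surrounding text:
print(sorted(index["video"]))   # both pages mention "video" in their text
print(sorted(index["sports"]))  # matches page2's title text only
```

Names, places, or music occurring inside the recordings themselves appear nowhere in `index`, which is the gap the passage above identifies.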
The information included in this Background section of the specification, including any references cited herein and any description or discussion thereof, is included for technical reference purposes only and is not to be regarded as subject matter by which the scope of the disclosure is to be bound.