As directed by Congress in the Telecommunications Act of 1996, the FCC adopted rules requiring closed captioning of all television programming by 2010. The rules became effective Jan. 1, 1998. Closed captioning is designed to provide access to television for persons who are deaf and hard of hearing. It is similar to subtitles in that it displays the audio portion of a television signal as printed words on the television screen. Unlike subtitles, however, closed captioning is hidden as encoded data transmitted within the television signal, and provides information about background noise and sound effects. A viewer wishing to see closed captions must use a set-top decoder or a television with built-in decoder circuitry. Since July 1993, all television sets sold in the U.S. with screens thirteen inches or larger have had built-in decoder circuitry.
The rules require companies that distribute television programs directly to home viewers (“video program distributors”) to provide closed captioned programs. Video program distributors include local broadcast television stations, satellite television services, local cable television operators, and other companies that distribute video programming directly to the home. In some situations, video program distributors are responsible for captioning programs.
Beginning Jan. 1, 2000, the four major national broadcast networks (ABC, NBC, CBS, and Fox) and television stations in the top 25 television markets (as defined by Nielsen) that are affiliated with the major networks are not permitted to count electronic newsroom captioned programming towards compliance with their captioning requirements. Electronic newsroom captioning technology creates captions from a news script computer or teleprompter and is commonly used for live newscasts. Only material that is scripted can be captioned using this technology. Therefore, live field reports, breaking news, and sports and weather updates are typically not captioned. Impromptu, unscripted interaction among newsroom staff is also not captioned. Because of these limitations, the FCC decided to restrict the use of electronic newsroom captioning as a substitute for real-time captioning. This rule also applies to national non-broadcast networks (such as CNN®, HBO®, and other networks transmitting programs over cable or through satellite services) serving at least 50% of the total number of households subscribing to video programming services.
These requirements and restrictions force local and national programmers to provide “live” or “real-time” closed captioning services. Typically, real-time captions are produced by stenocaptioners, who are court reporters with special training. They use a special keyboard (called a “steno keyboard” or “shorthand machine”) to transcribe what they hear as they hear it. Unlike a traditional “QWERTY” keyboard, a steno keyboard allows more than one key to be pressed at a time. The basic concept behind machine shorthand is phonetic: combinations of keys represent sounds, although the actual theory used is much more complex than straight phonics. Stenocaptioners need to be able to write in real time at speeds well in excess of 225 words per minute, with a total error rate (TER) of under 1.5%. The steno output then goes into a computer system, where it is translated into text and commands. Captioning software on the computer formats the translated stream of text into captions and sends it to a caption encoder. The text stream may be sent directly to the computer or over the telephone using a modem.
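The translation and formatting steps described above can be illustrated with a minimal sketch. It is not the software used by any actual captioning system: the stroke dictionary, the command names `{PERIOD}` and `{NEW_SPEAKER}`, and the functions `translate` and `format_captions` are all hypothetical, and the 32-character row width is an assumption borrowed from line 21 captioning rather than something stated in this document.

```python
import textwrap

# Hypothetical stroke dictionary. Each key is one stroke (a set of keys pressed
# together); each value is either text or a command. The entries are
# illustrative only and do not follow any particular steno theory.
STENO_DICT = {
    "-T": "the",
    "KAT": "cat",
    "SAT": "sat",
    "TP-PL": "{PERIOD}",
    "STPH-FPLT": "{NEW_SPEAKER}",
}


def translate(strokes):
    """Translate a sequence of strokes into text and apply simple commands."""
    words = []
    for stroke in strokes:
        # Strokes missing from the dictionary pass through as raw steno.
        entry = STENO_DICT.get(stroke, stroke)
        if entry == "{PERIOD}":
            if words:
                words[-1] += "."  # attach the period to the previous word
        elif entry == "{NEW_SPEAKER}":
            words.append(">>")  # assumed speaker-change marker
        else:
            words.append(entry)
    return " ".join(words)


def format_captions(text, width=32):
    """Break translated text into caption rows of at most `width` characters."""
    return textwrap.wrap(text, width=width)


strokes = ["STPH-FPLT", "-T", "KAT", "SAT", "TP-PL"]
for row in format_captions(translate(strokes)):
    print(row)
```

A real system would also need to handle multi-stroke words, corrections, and timing, which this sketch leaves out; that complexity is part of why the work requires specially trained operators.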
There is no governing body for stenocaptioners. Many have credentials assigned by the state board overseeing court reporters, the National Court Reporters Association (NCRA) for machine shorthand writers, or the National Verbatim Reporters Association (NVRA) for mask reporters using speech recognition systems. Rates for stenocaptioners' services range from tens of dollars per hour to hundreds of dollars per hour. The cost to networks or television stations of providing “live” or “real-time” closed captioning services therefore varies greatly but is high because the process is labor intensive.
There are real-time speech recognition systems available for “mask reporters,” people who repeat everything they hear into a microphone embedded in a face mask while inserting speaker identification and punctuation. However, these mask reporting systems are also labor intensive and are unlikely to significantly reduce the cost of providing “live” or “real-time” closed captioning services.
The FCC-mandated captioning requirements, together with the rules restricting the use of electronic newsroom captioning as a substitute for real-time captioning, create a major expense for national broadcast and non-broadcast networks and major-market television stations. Because they are required to rely on specially trained stenocaptioners or mask reporters, as well as special software and computers, program creators may be required to spend tens to hundreds of thousands of dollars per year. Therefore, there is a need for a system and method that provides “live” or “real-time” closed caption data services at a much lower cost.