Closed captioning is an auxiliary data signal that is transmitted with a video signal indicating text that can be displayed on a display monitor. Closed captioning is most frequently used to provide a textual description of audible sounds (most notably, a transcript of spoken words) supplied with the video signal to aid hearing impaired viewers of video programs to experience an otherwise imperceptible audio portion of video programs.
Since 1992, all television sets in the United States measuring thirteen inches or more diagonally have been required to decode and to display closed captioning text on the television display screen. Furthermore, all broadcasting television equipment used in all television broadcast applications (i.e., satellite, cable television, and terrestrial broadcast) must carry the closed-captioning text end-to-end (whenever present).
The vertical blanking interval of a broadcast analog video signal can carry up to two octets (bytes) of closed captioning or "extended data service" ("EDS" or "XDS") data per field. The first field of each frame is used for carrying closed captioning data bytes and the second field of each frame is used for carrying XDS data bytes. (While the invention is illustrated herein for closed-captioning data, it is equally applicable to XDS data.) As such, up to 120 bytes of XDS data and closed captioning text may be broadcast each second (for NTSC standard video signals).
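The 120 bytes per second figure follows from two bytes per field at the NTSC field rate. The short Python sketch below works through the arithmetic; the function and constant names are illustrative only, and the exact NTSC field rate of 60000/1001 fields per second is used in place of the nominal 60.

```python
from fractions import Fraction

# NTSC field rate: nominally 60 fields/s, exactly 60000/1001 (about 59.94).
NTSC_FIELD_RATE = Fraction(60000, 1001)
# Two octets of closed captioning or XDS data per field, as stated above.
BYTES_PER_FIELD = 2


def vbi_bytes_per_second(field_rate=NTSC_FIELD_RATE,
                         bytes_per_field=BYTES_PER_FIELD):
    """Return the combined closed captioning + XDS byte rate."""
    return field_rate * bytes_per_field


if __name__ == "__main__":
    total = vbi_bytes_per_second()
    # Half of the fields carry closed captioning, the other half XDS.
    print(f"combined: {float(total):.2f} bytes/s")
    print(f"per service: {float(total / 2):.2f} bytes/s")
```

With the nominal rate of 60 fields per second the result is exactly 120 bytes per second, split evenly between closed captioning and XDS.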
MPEG-1 and MPEG-2 are popular standards for encoding digital video signals. See ISO/IEC 13818-1,2,3 Information Technology--Generic Coding of Moving Pictures and Associated Audio: Systems, Video and Audio, Nov. 11, 1994 ("MPEG-2") and ISO/IEC 11172-1,2,3 1993, Information Technology--Coding of Moving Pictures and Associated Audio for Digital Storage Media at up to about 1.5 Mbit/s-Parts 1,2,3: Systems, Video and Audio ("MPEG-1"). The MPEG standards specify syntaxes and semantics for formatting compressed video and audio and for decoding and recovering the video and audio for synchronized presentation. Video encoding includes dividing a picture (field or frame) into macroblocks or 16×16 arrays of luminance data and each chrominance block (or 8×8 array of data) that overlies each 16×16 array of luminance data. Some macroblocks are inter-picture motion compensated. The blocks of the motion compensated macroblocks, and the blocks of the non-motion compensated macroblocks, are then discrete cosine transformed, quantized, zig-zag (or alternate) scanned into a sequence, run-level encoded and then variable length encoded.
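To illustrate the zig-zag scan and run-level steps named above, the following Python sketch orders the coefficients of an already quantized 8×8 block along anti-diagonals and then emits (run, level) pairs. It is a simplified illustration, not the normative MPEG procedure: the function names are hypothetical, and the separate handling of intra DC coefficients, the end-of-block code, and the variable length coding tables are omitted.

```python
def zigzag_order(n=8):
    """Return (row, col) positions of an n x n block in zig-zag order."""
    order = []
    for s in range(2 * n - 1):            # s = row + col of an anti-diagonal
        rows = range(max(0, s - n + 1), min(s, n - 1) + 1)
        if s % 2 == 0:
            rows = reversed(rows)         # even diagonals run bottom-left to top-right
        order.extend((r, s - r) for r in rows)
    return order


def run_level_pairs(block):
    """Scan a quantized block in zig-zag order and emit (run, level) pairs.

    'run' counts the zero coefficients preceding each nonzero 'level'.
    Trailing zeros are dropped (an encoder would signal end of block).
    """
    pairs = []
    run = 0
    for r, c in zigzag_order(len(block)):
        level = block[r][c]
        if level == 0:
            run += 1
        else:
            pairs.append((run, level))
            run = 0
    return pairs


if __name__ == "__main__":
    block = [[0] * 8 for _ in range(8)]
    block[0][0] = 5     # first position in the scan
    block[1][0] = -2    # third position in the scan
    print(run_level_pairs(block))
```

Because quantization tends to zero out high-frequency coefficients, the zig-zag order groups the zeros at the end of the sequence, which is what makes the run-level representation compact.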
The MPEG-1 and MPEG-2 syntaxes specify a hierarchical organization for formatting a compressed video bitstream. Levels or layers in the hierarchy are provided for each of the following: a sequence of pictures, a group of pictures in the sequence, a picture in a group, a slice (or contiguous sequence of macroblocks) in a picture, a macroblock of a slice and a block of a macroblock. A header is provided for the sections of the bitstream corresponding to each of the sequence, group of pictures, and picture layers of the hierarchy. Other preliminary information similar to a header is inserted before the sections of each of the slice (e.g., slice_start_code) and macroblock (e.g., macroblock_address_increment and macroblock_type) layers. Other (optional) sections may be inserted into the sequence sections of the sequence layer (e.g., sequence extension, sequence displayable extension, sequence scalable extension, and user data (0)), the group of pictures sections of the group of pictures layer (e.g., extension data (1) and user data (1)) or the picture sections of the picture layer (e.g., picture coding extension, quant matrix extension, picture displayable extension, picture temporal scalable extension, picture spatial scalable extension, copyright extension, and user data (2)). An example of a picture section includes a picture header, a user data (2) section and compressed data for one picture formatted into multiple slice, macroblock and block sections.
MPEG defines no manner for carrying closed-captioning data. As such, there are at least four different syntaxes for carrying closed captioning data in the user data (2) section of a picture section. One or more of the syntaxes can also be used for formatting XDS data, which is an infrequently used additional pair of bytes carried in the odd field of a frame. These syntaxes are set forth in Tables 1-4 below. Below, nextbits() is a function that examines a number of bits in a sequence, "uimsbf" means "unsigned integer, most significant bit first" and "bslbf" means "bit string, leftmost bit first."
TABLE 1
Syntax 1

Field name                                    # of Bits               Mnemonic
picture_user_data() {
  user_data_start_code                        32
  while (nextbits() != 0x000001) {
    user_data_length                          8                       uimsbf
    user_data_type                            8                       uimsbf
    if (user_data_type == 0xff) {
      next_user_data_type                     8                       uimsbf
    }
    if (user_data_type == 0x09) {
      cc_data_bytes                           8*2                     uimsbf
    } else if (user_data_type == 0x0a) {
      eds_data_bytes                          8*2                     uimsbf
    } else if (user_data_type == 0x02 ||
               user_data_type == 0x04)
      reserved                                8*user_data_length-1    uimsbf
    else
      reserved                                8*user_data_length      uimsbf
  }
}

where:
user_data_start_code is the start code sequence defined in MPEG indicating that a user data section follows.
user_data_type is an indication of the type of data stored in the field. user_data_types specified by the constants 0x09 and 0x0a indicate closed captioning data. Other types are defined but are not discussed herein.
user_data_length is the number of data bytes following user_data_type before the next user_data_length field, unless user_data_type is the constant 0x02 or the constant 0x03, in which case user_data_length is the number of data bytes following user_data_length before the next user_data_length field.
cc_data_bytes are two bytes of closed captioning data.
eds_data_bytes are two bytes of EDS data which are treated as closed captioning data herein.
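As an illustration only, a parser for the Syntax 1 loop of Table 1 might be sketched in Python as follows. The function name and the returned tuple layout are hypothetical. The sketch follows the table body literally: two data bytes are read for types 0x09 and 0x0a regardless of user_data_length, and types 0x02 and 0x04 are treated as having a length that includes the type byte (the notes to the table name 0x02 and 0x03 instead, so this set is kept as a parameter).

```python
START_CODE_PREFIX = b"\x00\x00\x01"

# Assumption: taken from the body of Table 1; the table notes say 0x02/0x03.
LENGTH_INCLUDES_TYPE = frozenset({0x02, 0x04})


def parse_syntax1(payload, length_includes_type=LENGTH_INCLUDES_TYPE):
    """Parse the bytes that follow user_data_start_code (Syntax 1).

    Returns a list of (kind, data) tuples, where kind is "cc" or "eds"
    and data holds the two payload bytes of that data group.
    """
    groups = []
    pos = 0
    # Stop at the next start code prefix or when the buffer runs out.
    while pos + 2 <= len(payload) and payload[pos:pos + 3] != START_CODE_PREFIX:
        length = payload[pos]
        dtype = payload[pos + 1]
        pos += 2
        if dtype == 0xFF:
            pos += 1                      # next_user_data_type (ignored here)
        if dtype in (0x09, 0x0A):
            data = payload[pos:pos + 2]
            if len(data) < 2:
                break                     # truncated group
            groups.append(("cc" if dtype == 0x09 else "eds", bytes(data)))
            pos += 2
        elif dtype in length_includes_type:
            pos += max(length - 1, 0)     # reserved bytes
        else:
            pos += length                 # reserved bytes
    return groups


if __name__ == "__main__":
    sample = bytes([0x03, 0x09, 0x41, 0x42,      # closed captioning group
                    0x03, 0x0A, 0x01, 0x02])     # EDS group
    sample += b"\x00\x00\x01\x01"                # next start code
    print(parse_syntax1(sample))
```

The loop has no way to validate that a buffer actually uses Syntax 1; fed data in another syntax it will simply misinterpret the bytes.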
TABLE 2
Syntax 2

Field name                                    # of Bits                   Mnemonic
picture_user_data() {
  user_data_start_code                        32
  while (nextbits() != 0x000001) {
    user_data_length                          8                           uimsbf
    user_data_type                            8                           uimsbf
    if (user_data_type == 0xff) {
      next_user_data_type                     8                           uimsbf
    }
    if (user_data_type == 0x09) {
      cc_data_bytes                           8*(user_data_length-1)      uimsbf
    } else if (user_data_type == 0x0a) {
      eds_data_bytes                          8*(user_data_length-1)      uimsbf
    } else
      reserved                                8*(user_data_length-1)      uimsbf
  }
}
TABLE 3
Syntax 3

Field name                                      # of Bits   Mnemonic
user_data() {
  user_data_start_code                          32          bslbf
  ATSC_identifier = `0x4741 3934`               32          bslbf
  user_data_type_code                           8           uimsbf
  if (user_data_type_code == 0x03) {
    process_em_data_flag                        1           bslbf
    process_cc_data_flag                        1           bslbf
    additional_data_flag                        1           bslbf
    cc_count                                    5           uimsbf
    em_data                                     8           bslbf
    for (i=0; i<cc_count; i++) {
      marker_bits = `11111`                     5           bslbf
      cc_valid                                  1           bslbf
      cc_type                                   2           bslbf
      cc_data_1                                 8           bslbf
      cc_data_2                                 8           bslbf
    }
    marker_bits = `1111 1111`                   8           bslbf
    if (additional_data_flag) {
      while (nextbits() != `0000 0000 0000 0000 0000 0001`) {
        additional_user_data                    8
      }
    }
  }
}

where:
ATSC_identifier is a constant defined to be `0x4741 3934`.
user_data_type_code is defined to be the constant 0x03.
process_em_data_flag is a flag indicating the need to process emergency broadcast message data.
process_cc_data_flag is a flag indicating the need to process closed captioning data.
additional_data_flag is a flag indicating the presence of additional data.
cc_count is a counter indicating the number of pairs of closed captioning bytes present.
em_data is the emergency broadcast data.
cc_valid is a flag indicating that there is valid closed captioning data in this user data section.
cc_type denotes the type of closed captioning data present and follows the convention set forth in EIA, Recommended Practice for Advanced Television Closed Captioning, draft July 1, 1994. `00` denotes closed captioning, `01` denotes XDS, `10` denotes ATVCC Channel Packet Data and `11` denotes ATVCC Channel Packet Start. Note that cc_type values `10` and `11` are defined for HDTV only. However, the invention described herein can nevertheless parse such information.
cc_data_1 and cc_data_2 are the closed captioning data bytes for this picture (and, when cc_count > 1, the closed captioning data bytes for other omitted, subsequent pictures).
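Every field of Syntax 3 falls on a byte boundary (the three flags and cc_count share one byte, and each caption entry occupies three bytes), so a parser can work on whole bytes. The Python sketch below is one possible reading of Table 3; the function name and the returned dictionary layout are hypothetical, and additional_user_data is not decoded.

```python
ATSC_IDENTIFIER = b"GA94"   # the constant 0x4741 3934 as ASCII bytes

CC_TYPE_NAMES = {
    0b00: "closed captioning",
    0b01: "XDS",
    0b10: "ATVCC Channel Packet Data",
    0b11: "ATVCC Channel Packet Start",
}


def parse_syntax3(payload):
    """Parse the bytes that follow user_data_start_code (Syntax 3).

    Returns None if the ATSC identifier or type code does not match,
    otherwise a dict with the flags and a list of caption entries.
    """
    if len(payload) < 7 or payload[:4] != ATSC_IDENTIFIER or payload[4] != 0x03:
        return None
    flags = payload[5]
    result = {
        "process_em_data": bool(flags & 0x80),
        "process_cc_data": bool(flags & 0x40),
        "additional_data": bool(flags & 0x20),
        "em_data": payload[6],
        "entries": [],
    }
    cc_count = flags & 0x1F
    pos = 7
    for _ in range(cc_count):
        if pos + 3 > len(payload):
            break                         # truncated section
        head = payload[pos]               # `11111` marker, cc_valid, cc_type
        result["entries"].append({
            "valid": bool(head & 0x04),
            "type": CC_TYPE_NAMES[head & 0x03],
            "data": bytes(payload[pos + 1:pos + 3]),
        })
        pos += 3
    return result


if __name__ == "__main__":
    sample = ATSC_IDENTIFIER + bytes([0x03, 0x42, 0xFF,
                                      0xFC, 0x41, 0x42,
                                      0xFD, 0x01, 0x02,
                                      0xFF])
    print(parse_syntax3(sample))
```

Unlike Syntaxes 1, 2 and 4, this syntax carries a fixed identifier, so a mismatch can be reported instead of silently misparsing the section.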
TABLE 4
Syntax 4

Field name                                      # of Bits   Mnemonic
picture_user_data() {
  user_data_start_code                          32          bslbf
  user_data_type_code                           8           uimsbf
  if (user_data_type_code == 0x03) {
    reserved                                    7           bslbf
    valid_flag                                  1           bslbf
    if (valid_flag == 0x01) {
      cc_count                                  5           uimsbf
      for (i=0; i<cc_count; i++) {
        reserved                                2           bslbf
        cc_type                                 2           bslbf
        reserved                                5           bslbf
        cc_data_1                               8           bslbf
        cc_data_2                               8           bslbf
        `1` (marker bit)                        1           bslbf
      }
    }
  }
  reserved                                      n           bslbf
  next_start_code()
}

where:
user_data_type_code is defined to be the constant `0x3`.
valid_flag is a flag indicating when the closed captioning data is valid.
cc_type is `01` for closed captioning data and `10` for XDS data.
next_start_code() is the next MPEG compatible start code.
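In Syntax 4 the 5-bit cc_count and the 26-bit caption entries are not byte aligned, so the fields have to be read bit by bit. The following Python sketch shows one way this could be done; the BitReader class and parse_syntax4 function are hypothetical helpers written for this illustration, and the trailing reserved bits are ignored.

```python
class BitReader:
    """Read unsigned integers MSB-first from a bytes object."""

    def __init__(self, data):
        self.data = data
        self.pos = 0                      # position in bits

    def read(self, nbits):
        if self.pos + nbits > 8 * len(self.data):
            raise EOFError("not enough bits")
        value = 0
        for _ in range(nbits):
            byte = self.data[self.pos >> 3]
            bit = (byte >> (7 - (self.pos & 7))) & 1
            value = (value << 1) | bit
            self.pos += 1
        return value


def parse_syntax4(payload):
    """Parse the bytes that follow user_data_start_code (Syntax 4).

    Returns None if user_data_type_code is not 0x03, otherwise a list
    of (cc_type, data) tuples (empty when valid_flag is 0).
    """
    reader = BitReader(payload)
    if reader.read(8) != 0x03:
        return None
    reader.read(7)                        # reserved
    if reader.read(1) != 1:               # valid_flag
        return []
    entries = []
    for _ in range(reader.read(5)):       # cc_count
        reader.read(2)                    # reserved
        cc_type = reader.read(2)          # 0b01 = captions, 0b10 = XDS
        reader.read(5)                    # reserved
        data = bytes([reader.read(8), reader.read(8)])
        reader.read(1)                    # marker bit
        entries.append((cc_type, data))
    return entries


if __name__ == "__main__":
    bits = ("00000011" "0000000" "1" "00001"
            "00" "01" "00000" "01000001" "01000010" "1" "0")
    sample = int(bits, 2).to_bytes(len(bits) // 8, "big")
    print(parse_syntax4(sample))
```

Note that Syntax 3 and Syntax 4 assign different meanings to the same cc_type codes (`01` is XDS in Table 3 but closed captioning in Table 4), so the two parsers cannot share a lookup table.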
For purposes of identifying closed captioning data, syntax 2 is very similar to syntax 1. The differences are as follows. In syntax 1, one or more data groups follow the user_data_start_code, where each data group includes a user_data_length, a user_data_type and two closed captioning bytes. However, the user_data_length of each data group is always the constant 0x03 because the length of the one byte field user_data_type, which precedes each pair of closed captioning bytes, is added to the length of the closed captioning bytes. In syntax 2, data groups containing user_data_type values 0x09 and 0x0a can have either two or four closed captioning bytes. The first pair of bytes corresponds to the same picture (i.e., frame) containing the user data section in which the closed captioning bytes are found. The second pair of closed captioning bytes corresponds to a subsequent picture which was contained in the original unencoded picture sequence but was omitted from the encoded video signal (e.g., because the subsequent frame was a repeat field, detected during an inverse telecine process of the encoding, or for some other reason).
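The two-or-four byte behavior of syntax 2 can be illustrated with a short Python sketch. It assumes, per Table 2, that user_data_length counts the user_data_type byte plus the data bytes, and it labels the first pair "current" and any second pair "omitted"; the function name and these labels are hypothetical.

```python
START_CODE_PREFIX = b"\x00\x00\x01"


def parse_syntax2(payload):
    """Parse the bytes that follow user_data_start_code (Syntax 2).

    Returns one dict per closed captioning or EDS data group, with the
    byte pair for the current picture and, if present, the pair for a
    subsequent picture omitted from the encoded signal.
    """
    groups = []
    pos = 0
    while pos + 2 <= len(payload) and payload[pos:pos + 3] != START_CODE_PREFIX:
        length = payload[pos]
        dtype = payload[pos + 1]
        pos += 2
        if dtype == 0xFF:
            pos += 1                      # next_user_data_type (ignored here)
        body_len = max(length - 1, 0)     # length includes the type byte
        body = payload[pos:pos + body_len]
        pos += body_len
        if dtype in (0x09, 0x0A):
            # Split the body into complete two-byte pairs.
            pairs = [bytes(body[i:i + 2]) for i in range(0, len(body) - 1, 2)]
            groups.append({
                "kind": "cc" if dtype == 0x09 else "eds",
                "current": pairs[0] if pairs else None,
                "omitted": pairs[1] if len(pairs) > 1 else None,
            })
    return groups


if __name__ == "__main__":
    sample = bytes([0x05, 0x09, 0x41, 0x42, 0x43, 0x44,   # four caption bytes
                    0x03, 0x0A, 0x01, 0x02])              # two EDS bytes
    sample += b"\x00\x00\x01\x00"                         # next start code
    print(parse_syntax2(sample))
```

A data group with user_data_length 0x03 parses identically under syntax 1 and syntax 2, which is one reason the two are hard to tell apart from the bitstream alone.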
Considering the different closed captioning methods employed, there is no guarantee that a decoder will be able to parse the closed captioning data. This makes it difficult to manufacture interoperable encoding and decoding equipment. Furthermore, it is a non-trivial task to determine in real time which syntax has been used for encoding the closed captioning data at the time of decoding.
It is therefore an object of the present invention to overcome the disadvantages of the prior art.