1. Field of the Invention
This disclosure is related to video processing, and more particularly, to detecting scrolling text in a mixed-mode video sequence.
2. Description of the Related Technology
Multimedia processing systems, such as video encoders, may encode multimedia data using encoding methods based on international standards such as the MPEG-x and H.26x standards. Such encoding methods are generally directed to compressing the multimedia data for transmission and/or storage, and may combine both progressive and interlaced (non-progressive) sequences. Compression is, broadly speaking, the process of removing redundancy from the data. In addition, video display systems may transcode or transform multimedia data for various purposes, for example, to ensure compatibility with display standards such as NTSC, HDTV, or PAL, to increase frame rate in order to reduce perceived motion blur, and to achieve smooth motion portrayal of content whose frame rate differs from that of the display device. These transcoding methods may perform functions similar to those of the encoding methods in order to perform frame rate conversion, de-interlacing, and the like.
A video signal can be described in terms of a sequence of pictures, which include frames (each frame being an entire picture) or fields (e.g., an interlaced video stream comprises fields of alternating odd or even lines of a picture). The term “frame” may be used generally to refer to a picture, a frame, or a field. Multimedia processors, such as video encoders, may encode a frame by partitioning it into blocks or “macroblocks” of, for example, 16×16 pixels. The encoder may further partition each macroblock into subblocks. Each subblock may further comprise additional subblocks. For example, subblocks of a macroblock may include 16×8 and 8×16 subblocks. Subblocks of the 8×16 subblocks may include 8×8 subblocks, and so forth. Depending on context, a block may refer to a macroblock, a subblock, or even a single pixel.
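The partitioning described above can be illustrated with a minimal sketch. The function names `partition_frame` and `split_block` are hypothetical, and the sketch assumes blocks are simple rectangles given as (x, y, width, height); it is not tied to any particular encoder or standard.

```python
def partition_frame(width, height, block=16):
    """Tile a frame into macroblocks of block x block pixels.

    Blocks on the right and bottom edges are clipped if the frame
    dimensions are not multiples of the block size (an assumption
    made for this sketch).
    """
    return [
        (x, y, min(block, width - x), min(block, height - y))
        for y in range(0, height, block)
        for x in range(0, width, block)
    ]


def split_block(blk, vertical):
    """Split a block into two equal subblocks.

    vertical=True halves the width (16x16 -> two 8x16);
    vertical=False halves the height (16x16 -> two 16x8).
    """
    x, y, w, h = blk
    if vertical:
        return [(x, y, w // 2, h), (x + w // 2, y, w // 2, h)]
    return [(x, y, w, h // 2), (x, y + h // 2, w, h // 2)]


macroblocks = partition_frame(64, 48)      # 4 columns x 3 rows of 16x16
halves = split_block(macroblocks[0], vertical=True)    # two 8x16 subblocks
quarters = split_block(halves[0], vertical=False)      # two 8x8 subblocks
```

Applying `split_block` repeatedly mirrors the hierarchy in the text: a 16×16 macroblock yields 8×16 subblocks, which in turn yield 8×8 subblocks.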
Video sequences may be received by a receiving device in a compressed format and subsequently decompressed by a decoder in the receiving device. Video sequences may also be received in an uncompressed state. In either case, the video sequence is characterized at least by a frame rate and by a horizontal and vertical pixel resolution. Often, a display device associated with the receiving device may require a different frame rate and/or pixel resolution, in which case video reconstruction of one or more video frames may be performed. Reconstruction of video frames may comprise estimating a video frame between two or more already received (or received and decompressed) video frames. Furthermore, decoder devices may create new video data based on already reconstructed video data.
Frame rate conversion by pulldown is one example of new video data creation. Pulldown comprises repeating source frames in a known pattern to generate an output video signal that possesses more frames than the original. For example, when film is transferred to video, 24 frames per second of film are converted to 60 fields per second of video by “stretching” four frames of film to fill five frames of video. Because each NTSC frame comprises two complete fields, this results in ten fields for every four film frames. In 3:2 pulldown, for example, one film frame is used across three fields, the next across two, the next across three, and so on. The cycle repeats itself completely after four film frames have been processed. In interlaced (non-progressive) pulldown, the two fields correspond to the even and odd interlaced fields, while in progressive pulldown, the two fields correspond to the complete frame.
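The 3:2 cadence described above can be sketched as follows. This is only an illustration under simplifying assumptions: film frames are represented by labels rather than pixel data, field parity (odd/even) is ignored, and the names `pulldown_32` and `pair_fields` are hypothetical.

```python
from itertools import cycle


def pulldown_32(film_frames):
    """Repeat film frames across 3, 2, 3, 2, ... fields in turn."""
    fields = []
    for frame, repeat in zip(film_frames, cycle((3, 2))):
        fields.extend([frame] * repeat)
    return fields


def pair_fields(fields):
    """Group consecutive fields into two-field video frames."""
    return [tuple(fields[i:i + 2]) for i in range(0, len(fields) - 1, 2)]


fields = pulldown_32(["A", "B", "C", "D"])
video_frames = pair_fields(fields)
print(len(fields), len(video_frames))  # 10 5
print(video_frames)
```

Four film frames produce ten fields, or five video frames, which matches the 24-to-60 ratio in the text (24 × 10 / 4 = 60 fields per second). Two of the five video frames, ("A", "B") and ("B", "C"), combine fields from different film frames.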
One disadvantage of the 3:2 pulldown process is that, because film frames are held for unequal numbers of fields, the output video signal deviates slightly from the original film frames, and this deviation can be seen in the final image. As a result, the output video signal appears less smooth than the original version. This error is referred to as “motion judder” and may be corrected by the process of motion judder cancellation. Motion judder cancellation extracts the original frames from the output video signal and performs a new frame rate conversion on the extracted frames, resulting in a smooth video sequence.
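The frame extraction step of motion judder cancellation can be sketched as below. This is a simplified illustration, not a definitive method: it assumes the cadence and its phase are already known and that the field sequence starts at the beginning of a cycle, whereas a practical system would have to detect both. The name `extract_film_frames` is hypothetical, fields are represented by labels, and the subsequent frame rate conversion is not shown.

```python
from itertools import cycle


def extract_film_frames(fields, cadence=(3, 2)):
    """Recover one frame per cadence group from a pulled-down field list.

    Takes the first field of each group of repeated fields, stepping
    through the list by 3, 2, 3, 2, ... (the assumed cadence).
    """
    frames = []
    index = 0
    for repeat in cycle(cadence):
        if index >= len(fields):
            break
        frames.append(fields[index])
        index += repeat
    return frames


fields = ["A", "A", "A", "B", "B", "C", "C", "C", "D", "D"]
print(extract_film_frames(fields))  # ['A', 'B', 'C', 'D']
```

The recovered frames are evenly spaced in time again, so a new frame rate conversion can be applied to them without inheriting the uneven 3:2 repetition.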