The present invention relates to the field of processing and analyzing streaming digital video image signals and, more particularly, to a method for determining entropy of a pixel of a real time streaming digital video signal. The method of the present invention is particularly applicable for identifying the origin of, and processing, in real time, pixels of interlaced, non-interlaced, or de-interlaced, streaming digital video image signals, and for correcting errors produced during editing of streaming digital video image signals.
A streaming digital video image signal is represented as a continuous sequence of either fields, according to an interlaced scan refresh format, or frames, according to a non-interlaced or progressive scan refresh format. In the interlaced scan format, a digital video image signal in the form of a single image (frame) is represented using a pair of fields. One field of the pair features pixels located in alternate horizontal lines (rows), for example, odd numbered horizontal lines, of the field matrix. The second field of the pair features pixels located in the same field matrix only in the corresponding horizontal lines, for example, even numbered horizontal lines, which are missing pixels in the first field, such that portions of the image not represented in the first field are thereby represented in the second field. In the interlaced scan format, each frame of image data is scanned twice, once for the odd numbered horizontal lines of the frame, and another time for the even numbered horizontal lines of the frame, so that all of the horizontal lines of the odd field are followed by all of the horizontal lines of the even field. The pair of fields of odd and even horizontal lines in interlaced video constitutes the frame (one full resolution picture or image). By contrast, in the non-interlaced or progressive scan format, a digital video image signal is represented in its entirety using only a single field which includes pixels in all horizontal lines of the field matrix. Here, each frame or field of image data is scanned once from the top horizontal line to the bottom horizontal line without requiring interlacing action between two fields.
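By way of a non-limiting illustration only, the relationship between a frame and its pair of fields described above may be sketched as follows. The function names `split_fields` and `weave_fields`, and the representation of a frame as a list of rows, are assumptions made for this sketch and do not appear in any cited reference.

```python
def split_fields(frame):
    """Split a frame (a list of rows) into (odd_field, even_field).

    Horizontal lines are numbered from 1, so the odd field holds lines
    1, 3, 5, ... (list indices 0, 2, 4, ...) and the even field holds
    lines 2, 4, 6, ... (list indices 1, 3, 5, ...).
    """
    odd_field = frame[0::2]
    even_field = frame[1::2]
    return odd_field, even_field


def weave_fields(odd_field, even_field):
    """Interleave an odd field and an even field back into one frame."""
    frame = []
    for i in range(len(odd_field) + len(even_field)):
        # Even list indices come from the odd field, odd indices from the even field.
        source = odd_field if i % 2 == 0 else even_field
        frame.append(source[i // 2])
    return frame


# A hypothetical 4-line frame with 2 pixels per line.
frame = [[10, 11], [20, 21], [30, 31], [40, 41]]
odd, even = split_fields(frame)
restored = weave_fields(odd, even)
```

In this sketch, each field carries only half of the horizontal lines, and weaving the two fields of a pair reproduces the full resolution frame.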
In an interlaced scan format, the first and second fields of a pair are scanned consecutively on a video display monitor at a pre-determined rate of, for example, 60 fields per second, in order to reconstruct single image frames on the display at a standard broadcasting interlaced scan rate of, for example, 30 frames per second. In more recently developed video representation techniques, such as the non-interlaced or progressive scan format, frames are progressively scanned on a display at a standard progressive display rate of 60 frames per second.
Application of current interlaced scan format to television is typically according to the NTSC (National Television System Committee) standard format, or, according to the PAL (Phase Alternation by Line) standard format. In the NTSC format, each frame consists of one odd numbered field and one even numbered field of 262.5 horizontal scanning lines each, translating to 525 scanning lines per frame, with an established scan rate of 60 fields (30 frames) per second. In the PAL format, each frame consists of one odd numbered field and one even numbered field of 312.5 horizontal scanning lines each, translating to 625 scanning lines per frame, with an established scan rate of 50 fields (25 frames) per second.
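The arithmetic relating the per-field figures to the per-frame figures stated above may be checked with the following minimal sketch. The dictionary `STANDARDS` and the function `frame_parameters` are hypothetical names introduced for illustration, and the values are the nominal figures given in the preceding paragraph.

```python
from fractions import Fraction

# Nominal figures as stated in the text; exact fractions avoid rounding of x.5 lines.
STANDARDS = {
    "NTSC": {"lines_per_field": Fraction(525, 2), "fields_per_second": 60},
    "PAL": {"lines_per_field": Fraction(625, 2), "fields_per_second": 50},
}


def frame_parameters(name):
    """Return (lines_per_frame, frames_per_second) for two fields per frame."""
    std = STANDARDS[name]
    lines_per_frame = std["lines_per_field"] * 2
    frames_per_second = Fraction(std["fields_per_second"], 2)
    return int(lines_per_frame), int(frames_per_second)


ntsc = frame_parameters("NTSC")
pal = frame_parameters("PAL")
```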
Currently, regular video broadcasting by systems using NTSC, PAL, or SECAM types of standard formats often incorporates a mixing of video image signals acquired by more than one type of video camera source, such as various combinations of interlaced video, non-interlaced or progressive video, non-interlaced Hollywood movie film, and non-interlaced computer graphics camera sources. If the camera source acquires image signals according to a non-interlaced or progressive type format and the broadcasting is of an interlaced type format, image signals acquired by the non-interlaced or progressive camera source need to be converted into an interlaced type format. Alternatively, if the broadcasting is of a non-interlaced or progressive type format, such as an HDTV progressive type format, and the digital video image signal display is of an interlaced type format, the broadcast video image signals need to be converted into an interlaced type format for proper display.
Currently, in most broadcasting systems, non-interlaced or progressive image sources are converted into interlaced formats to match the broadcasting interlaced format. However, if the broadcasting is of an interlaced type format, comprising a mix of originally interlaced video image signals, of originally non-interlaced video image signals which have been interlaced, and of originally progressive video image signals which have been interlaced, and the digital video image signal display is of a non-interlaced or progressive type format, the broadcast interlaced digital video image signals need to be de-interlaced into a non-interlaced or progressive type format.
New high quality, high resolution TV display systems and devices, such as CRT PC monitors, high definition television (HDTV) desktop or workstation display monitors, flat liquid crystal display (LCD) panels, plasma display panels (PDP), home theater projectors, and video equipment, operate according to a non-interlaced progressive high resolution scan format, such as VGA (480 lines × 640 columns per frame), SVGA (600 lines × 800 columns per frame), XGA (768 lines × 1024 columns per frame), and UXGA (1200 lines × 1600 columns per frame), to scan and display digital video images. An example showing application of de-interlacing interlaced digital video image signals involves the use of a typical LCD display having 480 horizontal scanning lines with 640 dots per scanning line (VGA system). Since LCD display systems are designed to scan according to a non-interlaced progressive format, in order to display NTSC (525 lines per frame) or PAL (625 lines per frame) digital video image signals on the LCD display, interlaced digital video image signals need to be converted into de-interlaced digital video image signals for proper display on the LCD.
In order to properly convert broadcast digital video image signals into appropriately corresponding interlaced or progressive digital video image signal display formats, there is a need for real time identification of the original mode or type of camera source of the digital video image signals. There are various prior art teachings of methods, devices, and systems for identifying the original mode or type of camera source of digital video image signals.
In U.S. Pat. No. 4,982,280, issued to Lyon et al., there is disclosed a motion sequence pattern detector which automatically detects a periodic pattern of motion sequences within a succession of fields of video image signals, characterized by film mode or progressive scan mode, thereby indicating that a particular sequence originated from cinematographic film, or, from a progressive scan camera. Therein is particularly described a three-to-two (3:2) film to video sequence mode detector which automatically detects the presence of sequences of a video stream originating from transfer from cinematographic film according to a 3:2 pull down conversion performed during the transfer. The invention disclosed by Lyon et al. also includes a resynchronization procedure for automatically resynchronizing to film mode when a video splice in video originating with film occurs on a boundary other than a 3:2 pull down conversion field boundary.
The motion sequence pattern detector of Lyon et al. is based on using a pixel displacement circuit for detecting displacement of a pixel within successive video frames for each field of the video sequence and having a motion signal output indicative of displacement due to the unique properties attributable to the video sequence. Accordingly, the teachings of Lyon et al. focus on, and entirely depend upon, and are therefore limited to, using a process of pattern recognition, in general, and pattern recognition of motion sequences, in particular, among sequences of fields in a streaming video signal.
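For illustration of the kind of periodic pattern on which such pattern recognition relies, the following minimal sketch generates a 3:2 pull down field sequence from film frames and flags the repeated fields. It is not the circuit of Lyon et al.; the function names `pulldown_32` and `repeated_field_flags`, and the representation of a field as a (frame label, parity) tuple, are assumptions made for this sketch.

```python
def pulldown_32(film_frames):
    """Map film frames to fields using a 3, 2, 3, 2, ... cadence.

    Each output field is a (frame_label, parity) tuple, and parity
    alternates strictly between "odd" and "even" along the sequence.
    """
    fields = []
    for index, frame in enumerate(film_frames):
        repeats = 3 if index % 2 == 0 else 2
        for _ in range(repeats):
            parity = "odd" if len(fields) % 2 == 0 else "even"
            fields.append((frame, parity))
    return fields


def repeated_field_flags(fields):
    """Flag each field identical to the field two positions earlier."""
    return [i >= 2 and fields[i] == fields[i - 2] for i in range(len(fields))]


fields = pulldown_32(["A", "B", "C", "D"])
flags = repeated_field_flags(fields)
repeat_positions = [i for i, flag in enumerate(flags) if flag]
```

In this sketch, four film frames yield ten fields, so that 24 film frames per second yield 60 fields per second, and one repeated field recurs every five fields. A splice on a boundary other than a pull down field boundary would break this periodicity, which is the situation the resynchronization procedure described above addresses.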
In U.S. Pat. No. 5,291,280, issued to Faroudja et al., there are disclosed techniques for motion detection between even and odd fields within a 2:1 interlaced conversion television standard. Therein is described a field motion detector for use within a 2:1 interlaced temporal video signal stream in conjunction with a frame motion detector operating upon the same signal stream, in order to identify motion on a field by field basis. Similar to operation of the motion sequence pattern detector disclosed by Lyon et al., the teachings of Faroudja et al. focus on, and entirely depend upon, and are therefore limited to, using a process of pattern recognition, in general, and pattern recognition of motion sequences, in particular, among sequences of fields in a streaming video signal.
In U.S. Pat. No. 5,452,011, issued to Martin et al., there is disclosed a method and apparatus for determining if a streaming video signal has characteristics of a signal originating from an interlaced video mode, or, has characteristics of a signal originating from a non-interlaced film mode. The teachings of Martin et al. are based upon, and limited to, evaluating ‘accumulated’, not ‘individual’, differences of pixels of two fields in successive frames of the video signal, and comparing the accumulated differences to threshold values, for making decisions regarding further processing of the video signal.
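The distinction between ‘accumulated’ and ‘individual’ pixel differences may be illustrated by the following minimal sketch. It is not the method of Martin et al.; the function names, the sample field values, and the threshold value are hypothetical and chosen only to show how a summed measure can mask a large difference at a single pixel.

```python
def individual_differences(field_a, field_b):
    """Return the absolute difference at each pixel position."""
    return [
        [abs(a - b) for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(field_a, field_b)
    ]


def accumulated_difference(field_a, field_b):
    """Return the sum of all individual absolute differences."""
    return sum(sum(row) for row in individual_differences(field_a, field_b))


def exceeds_threshold(field_a, field_b, threshold):
    """Compare the accumulated difference with a threshold value."""
    return accumulated_difference(field_a, field_b) > threshold


# Two hypothetical fields that differ at exactly one pixel.
field_a = [[10, 10, 10], [10, 10, 10]]
field_b = [[10, 10, 10], [10, 40, 10]]

per_pixel = individual_differences(field_a, field_b)
total = accumulated_difference(field_a, field_b)
flagged = exceeds_threshold(field_a, field_b, threshold=50)
```

Here the accumulated difference stays below the assumed threshold, although one individual pixel differs substantially between the two fields.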
In U.S. Pat. No. 4,967,271, issued to Campbell et al., there is disclosed a television scan line doubler, for increasing the number of apparent scan lines of a display device, including a temporal median filter for pixel interpolation and a frame-by-frame based motion detector. The invention is based on methods of interpolation and use of an interpolator unit.
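As a non-limiting illustration of pixel interpolation by a temporal median filter, the following sketch fills in a missing scan line from three values per pixel position. The choice of taps, namely the pixel above, the pixel below, and the pixel at the same position in the previous field, is an assumption made for this sketch and is not taken from Campbell et al.; the function names are likewise hypothetical.

```python
def median3(a, b, c):
    """Return the median of three values."""
    return sorted((a, b, c))[1]


def interpolate_missing_line(line_above, line_below, previous_field_line):
    """Interpolate one missing scan line, pixel by pixel.

    Each output pixel is the median of the pixel above, the pixel below,
    and the pixel at the same position in the previous field.
    """
    return [
        median3(up, down, prev)
        for up, down, prev in zip(line_above, line_below, previous_field_line)
    ]


# Hypothetical pixel values for three positions along a scan line.
line = interpolate_missing_line([10, 10, 10], [20, 20, 20], [15, 100, 0])
```

In this sketch, the previous-field value is used where it lies between its vertical neighbors, and is otherwise replaced by the nearer of the two neighbors.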
In Japanese Patent Application Publication No. 00341648 JP, published Dec. 08, 2000, of Japanese Patent Application No. 11145233, filed May 25, 1999, of applicant Pioneer Electronic Corp., there is disclosed a video signal converting and mode detector device for speedily identifying whether or not an input video signal is a telecine-converted video signal from a projection or movie film, or, the video signal of a standard television system, on the basis of respective correlation values between an interpolation field and fields before and after one field period. A motion detector detects the motion of an image and a scan line interpolator performs scan line interpolating processing corresponding to detection outputs from a mode detector and the motion detector.
In the publication, Transactions On Consumer Electronics, August 1999, by Schu, M. et al., of Infineon Technologies AG, Munich, Germany, there is described a “System On Silicon—IC For Motion Compensated Scan Rate Conversion, Picture-In-Picture processing, Split Screen Applications And display Processing”, including the use of a method for “3-D-predictive motion estimation”. The described 3-D predictive motion estimation algorithm uses vector information of previously calculated blocks (fields) in advance to calculate the motion vector for the actual block (field). The source vectors are taken from the same temporal plane (spatial components) and the previous temporal plane (temporal components) and used as prediction. Combinations of these predictors and a zero-vector are used for building a set of candidate vectors for the actual block (field). The block (field) positions pointed to by these candidates are evaluated by comparison criteria, in particular, using Summed Absolute Difference (SAD) criteria, where the absolute difference between the blocks (fields) is summed pixel by pixel. The best match is chosen and its assigned vector is taken as predictor for the next blocks (fields) and for use in the scan rate conversion.
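The evaluation of candidate vectors by a Summed Absolute Difference (SAD) criterion, as described above, may be sketched as follows. This is a simplified illustration and not the algorithm of Schu et al.: the candidate set is supplied by the caller rather than derived from spatial and temporal predictors, and the names `sad`, `extract_block`, and `best_candidate`, as well as the sample images, are assumptions made for this sketch.

```python
def sad(block_a, block_b):
    """Sum the absolute differences between two blocks, pixel by pixel."""
    return sum(
        abs(a - b)
        for row_a, row_b in zip(block_a, block_b)
        for a, b in zip(row_a, row_b)
    )


def extract_block(image, top, left, size):
    """Return the size x size block whose top-left corner is (top, left)."""
    return [row[left:left + size] for row in image[top:top + size]]


def best_candidate(current, previous, top, left, size, candidates):
    """Return (vector, cost) of the candidate (dy, dx) with the lowest SAD.

    Each vector points from the block position in the current image to the
    compared position in the previous image; out-of-bounds candidates are skipped.
    """
    target = extract_block(current, top, left, size)
    height, width = len(previous), len(previous[0])
    best_vector, best_cost = None, None
    for dy, dx in candidates:
        r, c = top + dy, left + dx
        if r < 0 or c < 0 or r + size > height or c + size > width:
            continue
        cost = sad(target, extract_block(previous, r, c, size))
        if best_cost is None or cost < best_cost:
            best_vector, best_cost = (dy, dx), cost
    return best_vector, best_cost


# Hypothetical images: a 2 x 2 pattern moves one pixel to the right.
previous = [[0, 0, 0, 0, 0], [0, 9, 8, 0, 0], [0, 7, 6, 0, 0], [0, 0, 0, 0, 0]]
current = [[0, 0, 0, 0, 0], [0, 0, 9, 8, 0], [0, 0, 7, 6, 0], [0, 0, 0, 0, 0]]
candidates = [(0, 0), (0, -1), (1, 0)]  # the zero-vector plus two assumed predictors
vector, cost = best_candidate(current, previous, 1, 2, 2, candidates)
```

In a predictive scheme of the kind described, the winning vector would then serve as a predictor for subsequent blocks.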
Application and suitability of any of these particular techniques for identifying the original mode or type of camera source of digital video image signals strongly depends on the resulting video image quality. Moreover, success in applying a particular mode or motion identification technique varies with overall system operating conditions, and/or, with specific video image signal processing conditions.
TV stations and video broadcasting systems are increasingly broadcasting various mixes of video image signals acquired by a variety of video camera sources operating according to different formats, such as interlaced video, non-interlaced or progressive video, non-interlaced Hollywood movie film, and non-interlaced computer graphics camera sources. This trend, coupled with the continued widespread usage of interlaced format TV display devices and systems and the increasing appearance and usage of progressive TV display devices and systems, creates a significant on-going need for new approaches and techniques applicable to real time identification of the original mode or type of camera source of digital video image signals, in order to properly convert the broadcast digital video image signals into an interlaced or progressive format corresponding to the digital video image signal display format. Moreover, there is a corresponding on-going need for new approaches and techniques applicable to real time correction of errors produced during editing of the digital video image signals.
There is thus a need for, and it would be highly advantageous to have, a method for determining entropy of a pixel of a real time streaming digital video image signal, which is particularly applicable for identifying the origin of, and processing, in real time, pixels of interlaced, non-interlaced, or de-interlaced, streaming digital video image signals, and for correcting errors produced during editing of streaming digital video image signals.
Moreover, there is a need for such an invention for analyzing and processing fields of pixels of a streaming digital video image signal which (1) is independent of the type of the mode conversion used for generating the original streaming digital video image input signal, for example, a 3:2 or 2:2 pull down mode conversion method for converting film movies appropriate for a DVD disk player operating with a video NTSC or PAL format, and (2) is not based upon known methods or techniques involving ‘a priori’ pattern recognition based upon known methods or techniques involving evaluation of sets of ‘accumulated’ or ‘summed’ differences, instead of ‘individual’ differences, of pixel values located in successive fields of the streaming digital video image signal.