It is common for a video presentation to include material that was originally created at different frame rates. Typically, content with different frame rates is converted to a common frame rate for display. For a description of the issues associated with displaying different types of video content and with frame rate conversion, refer now to the following.
Display Characteristics
Displays can be broken down into two broad categories, analog and digital. Each display type has its own assets and liabilities.
Analog Displays
Analog displays or CRTs (Cathode Ray Tubes) have two subcategories, progressive and interlaced. Progressive displays, such as computer monitors, display complete frames. They draw each line of the image consecutively. They draw line 1, line 2, line 3 etc. For interlaced displays, such as a conventional television set, the image is divided up into 2 fields. The first field contains the odd lines and is often called the odd field, and the second field contains the even lines which is commonly called the even field. The interlaced display performs a temporal interleaving of the odd and even fields to create the complete video images.
Both display types have similar characteristics. The horizontal display timing must remain constant. Instantaneous changes in the horizontal timing may cause a shifting of the video information due to the relatively narrow bandwidth of the PLL (phase-locked loop) used in the horizontal deflection circuit. The vertical display timing may be able to tolerate some variation, since the vertical deflection circuit does not make use of a PLL and may be reset instantaneously. Due to the impulsive nature (low temporal fill factor) of the light output from a CRT, operation at a low frame rate will cause large area flicker to become visible. This appears as a shimmering of the entire image. Therefore, CRTs are typically operated at higher frame rates than some other types of displays.
Digital Displays
Devices such as LCD panels and plasma panels are considered digital displays. These displays are typically pixelated; that is, they display a specific number of pixels per line and lines per frame. Digital displays typically display progressive images, so interlaced images must usually be converted to progressive ones before they can be displayed. Some typical characteristics of such displays are the following: the horizontal active video (number of active pixels) must remain constant; the vertical active video (number of active lines) must remain constant; the total horizontal timing can vary; and the total vertical timing can vary.
Due to the sample and hold nature (high temporal fill factor) of the light output from some digital displays (in particular LCDs), operation at a low frame rate is often possible without the side effect of large area flicker.
Content Characteristics
There are different types of material content, such as film content and different types of video content.
Film Content
FIG. 1 illustrates film content and displayed images. Film content is traditionally shot and displayed at a rate of 24 Hz. This means that 24 images are shown every second. Each image contains all of the content at once, much like a photograph. Typically, in movie theaters each film frame is shuttered at 48 Hz or 72 Hz to avoid large area flicker.
Video Content
Video content is displayed in two formats, interlaced and progressive. In an interlaced format each video image or frame consists of two fields, an odd field and an even one. The odd field consists of the odd lines of the image and the even field consists of the even lines of the image. These fields are interleaved temporally and spatially. FIG. 2 illustrates a video frame 10 separated into fields 12 and 14.
The dotted lines in the frame snapshot are the odd sampled lines of the image. The solid lines in the frame snapshot are the even sampled lines of the image. When an interlaced display device draws the image, it draws the odd lines in the first field time slot, and in the second field time slot, it draws the even lines in such a way as to place the even lines in between the odd lines.
For 60 Hz countries like the USA, the video frame rate is 30 Hz. For 50 Hz countries like Germany, the video frame rate is 25 Hz. Typical interlaced video systems that use these frame rates are 480i and 1080i (NTSC), 576i (PAL) and 1080i (PAL).
For progressive images, each image is complete. Each line of the image, odd and even, is drawn sequentially. This is similar to the way a film is displayed, except that it is done in a raster format, that is, it is sampled into lines. Computer displays, for example, typically use this format. In the video domain 480P and 720P operate in this fashion.
Whether an image is interlaced or progressive, both formats share the concept of active video versus total video. FIG. 3 illustrates an entire video frame 11. It comprises two elements, active video and blanking. The white area shown in the middle is the active video 13. It is defined by two parameters, HACTIVE and VACTIVE. HACTIVE defines the number of pixels per line that comprise the displayed image. VACTIVE defines the number of lines that comprise the displayed image. The black area is referred to as the blanking interval 15. The horizontal blanking interval is defined as the number of pixels per line (HTOTAL) minus the number of active pixels per line (HACTIVE). The vertical blanking interval is defined as the number of lines per field/frame (VTOTAL) minus the number of active lines per field/frame (VACTIVE).
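The two blanking definitions above reduce to simple subtraction. The following is a minimal sketch; the class name VideoTiming and the numeric values are assumptions chosen for illustration and do not come from the description.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VideoTiming:
    """Hypothetical container for the four timing parameters."""
    htotal: int   # total pixels per line, blanking included
    hactive: int  # displayed pixels per line
    vtotal: int   # total lines per field/frame, blanking included
    vactive: int  # displayed lines per field/frame

    @property
    def horizontal_blanking(self) -> int:
        # HTOTAL minus HACTIVE, in pixels per line
        return self.htotal - self.hactive

    @property
    def vertical_blanking(self) -> int:
        # VTOTAL minus VACTIVE, in lines per field/frame
        return self.vtotal - self.vactive


# Assumed example values, for illustration only.
timing = VideoTiming(htotal=858, hactive=720, vtotal=525, vactive=480)
print(timing.horizontal_blanking, timing.vertical_blanking)  # prints "138 45"
```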
Video Frame Rate Conversion
Frame rate conversion of the video in its simplest terms is the translation from one image rate to another. There are many different types of frame rate conversions ranging from a simple drop/repeat method to a sophisticated motion compensated method. The simple drop and repeat method is commonly used because it is easiest to implement as a system and is most cost effective.
To describe the simple drop and repeat method in more detail, first frame rates must be discussed. There are many types of frame rates for video systems. As mentioned previously 2 common frame rates used in video are 25 Hz (PAL) and 30 Hz (NTSC). Native video material, such as sporting events or news programs, is created by taking a field snapshot every 1/50 sec or 1/60 sec. Each field contains an image of the subject at a unique point in time.
Since film material was originally created at 24 Hz, it must be modified to be used in the video domain. For 60 Hz countries, like the United States, a conversion process known as 3:2 pulldown is used. The first film frame is displayed over 3 video fields and the second film frame is displayed over 2 video fields. Imagine a film sequence of frames ABCD. Each frame is captured 1/24 sec after the last one. When converted into video, the sequence would look like Aodd, Aeven, Aodd, Beven, Bodd, Ceven, Codd, Ceven, Dodd, Deven where the subscript refers to the odd field or the even field and each contains the odd lines or the even lines of the progressive film frame, respectively. This is shown in FIG. 6. For 50 Hz countries, like Germany, 2:2 pulldown is used. In this case each film frame is displayed over 2 video fields. Imagine the same film sequence used above, in a 2:2 pulldown conversion scenario. The video sequence would look like Aodd, Aeven, Bodd, Beven, Codd, Ceven, Dodd, Deven.
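The pulldown sequences described above can be reproduced with a short sketch. The function name pulldown and its cadence parameter are hypothetical; the sketch assumes that field parity simply alternates, starting with the odd field.

```python
def pulldown(frames, cadence):
    """Expand film frames into a list of video field labels.

    cadence is a repeating pattern of field counts per film frame,
    e.g. (3, 2) for 3:2 pulldown or (2, 2) for 2:2 pulldown.
    """
    fields = []
    for i, frame in enumerate(frames):
        for _ in range(cadence[i % len(cadence)]):
            # Assumption: parity alternates strictly, odd field first.
            parity = "odd" if len(fields) % 2 == 0 else "even"
            fields.append(f"{frame}{parity}")
    return fields


print(", ".join(pulldown("ABCD", (3, 2))))
# Aodd, Aeven, Aodd, Beven, Bodd, Ceven, Codd, Ceven, Dodd, Deven
print(", ".join(pulldown("ABCD", (2, 2))))
# Aodd, Aeven, Bodd, Beven, Codd, Ceven, Dodd, Deven
```

Four film frames yield ten fields under 3:2 pulldown and eight fields under 2:2 pulldown, which is how 24 Hz film is fitted to 60 Hz fields in the first case and, run slightly fast, to 50 Hz fields in the second.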
Referring now to material with a 3:2 pulldown pattern, different up and down conversions can be created depending upon which frame is dropped or repeated. When up converting by a ratio of 6:5, a 3:2 sequence like AAABB may be converted to AAAABB or AAABBB. When down converting by a ratio of 4:5, possible outputs for the same sequence include AAAB and AABB. In each case the two choices are very different visually. Motion judder is the difference between where the eye expects the image to be and where it actually is. For the symmetric up and down conversions, AAABBB and AABB, the motion judder is decreased compared to the input, because each film image is displayed for exactly the same amount of time. For the asymmetric sequences AAAABB and AAAB, the motion judder is increased compared to the input. Care must therefore be taken when doing up and down conversions on asymmetric film material so as not to increase the amount of motion judder. This procedure is more fully described hereinbelow.
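The difference between the symmetric and asymmetric outputs can be made concrete by comparing how long each film image is held. The sketch below uses the gap between the longest and shortest run of repeated frames as a crude stand-in for judder; this measure, and the name cadence_spread, are assumptions made for illustration and are not a perceptual metric.

```python
from itertools import groupby


def run_lengths(seq):
    """Number of consecutive displays of each image, e.g. 'AAABB' -> [3, 2]."""
    return [len(list(group)) for _, group in groupby(seq)]


def cadence_spread(seq):
    """Longest minus shortest hold time; 0 means every image is shown equally long."""
    runs = run_lengths(seq)
    return max(runs) - min(runs)


for seq in ("AAABB", "AAAABB", "AAABBB", "AAAB", "AABB"):
    print(seq, run_lengths(seq), cadence_spread(seq))
```

Under this measure the input AAABB has a spread of 1, the symmetric outputs AAABBB and AABB have a spread of 0, and the asymmetric outputs AAAABB and AAAB have a spread of 2.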
Up Conversion
In an up conversion or repeat frame scenario, if the last output frame is finished being displayed before the next input frame is available for reading then the last input frame is drawn again. This is the method of frame rate up conversion. FIG. 4 illustrates frame rate up conversion. Note that frame A is displayed twice. The repeated frame is highlighted by shading. For up conversion, one of the input frames is repeated at the output. In the case of an input sequence conforming to a 3:2 pull down pattern, up conversion can be advantageously performed by repeating the frame which occurs least frequently in the input. Thus the input sequence would be converted from 3:2 to 3:3. Maintaining the symmetry of the output sequence in this way reduces the detrimental appearance of motion judder. For material not conforming to the 3:2 pull down pattern, such as video, up conversion will introduce a periodic motion judder, which is undesirable.
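As a minimal sketch of the cadence-aware choice described above, the following repeats one frame of whichever image is shown least often within a single 3:2 cycle. The function name repeat_least_frequent is hypothetical, and the sketch assumes the cadence has already been detected and that the input holds exactly one cycle.

```python
from itertools import groupby


def repeat_least_frequent(seq):
    """Add one extra copy of the image with the shortest run in seq."""
    runs = [(frame, len(list(group))) for frame, group in groupby(seq)]
    # Index of the shortest run; ties resolve to the earliest one.
    target = min(range(len(runs)), key=lambda i: runs[i][1])
    out = []
    for i, (frame, count) in enumerate(runs):
        out.extend([frame] * (count + (1 if i == target else 0)))
    return "".join(out)


print(repeat_least_frequent("AAABB"))  # AAABBB, i.e. 3:2 becomes 3:3
```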
Down Conversion
When the display frame rate is slower than the input frame rate, we need to drop input fields periodically. This is the method of frame rate down conversion. FIG. 5 illustrates frame rate down conversion.
For frame rate down conversion, one of the input frames is dropped from the output sequence. This is done when two input frames are ready for reading before the last output frame is fully displayed. Then only the latest input frame is used and the earlier one is dropped as shown in FIG. 5. It can be seen that input frame D, highlighted by shading, has been dropped from the output frame sequence.
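The simple drop and repeat behavior can be modeled by letting each output frame time pick the newest input frame that has already arrived. The sketch below is an idealized model under stated assumptions: input and output clocks start aligned, both rates are integers, and the name resample_latest is hypothetical. Which particular frame is dropped or repeated depends on the relative phase of the two clocks, so the result need not match FIG. 4 or FIG. 5.

```python
def resample_latest(frames, in_rate, out_rate):
    """For each output tick, use the newest input frame available at that time."""
    n_out = len(frames) * out_rate // in_rate
    # Output tick k occurs at time k / out_rate; the newest input frame
    # at that time has index floor(k * in_rate / out_rate).
    return [frames[k * in_rate // out_rate] for k in range(n_out)]


# Down conversion 5 -> 4: every fifth input frame is skipped.
print("".join(resample_latest("ABCDEFGHIJ", 5, 4)))  # ABCDFGHI

# Up conversion 5 -> 6: one input frame per cycle is shown twice.
print("".join(resample_latest("ABCDE", 5, 6)))  # AABCDE
```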
In the case of an input sequence conforming to a 3:2 pull down pattern, down conversion is advantageously performed by dropping the frame that occurs most frequently in the input. Thus, the input sequence would be converted from 3:2 to 2:2, or even 1:1, for example. This will reduce the amount of motion judder visible to the viewer. Performing down conversion on material that does not conform to the 3:2 pattern, such as native video material, will introduce a periodic motion judder, which is undesirable.
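The corresponding choice for down conversion can be sketched in the same way: within a single 3:2 cycle, one frame of the image shown most often is removed. The function name drop_most_frequent is hypothetical, and the sketch assumes the cadence has already been detected and that the input holds exactly one cycle.

```python
from itertools import groupby


def drop_most_frequent(seq):
    """Remove one copy of the image with the longest run in seq."""
    runs = [(frame, len(list(group))) for frame, group in groupby(seq)]
    # Index of the longest run; ties resolve to the earliest one.
    target = max(range(len(runs)), key=lambda i: runs[i][1])
    out = []
    for i, (frame, count) in enumerate(runs):
        out.extend([frame] * (count - (1 if i == target else 0)))
    return "".join(out)


print(drop_most_frequent("AAABB"))  # AABB, i.e. 3:2 becomes 2:2
```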
Motion Compensation
A more sophisticated approach to frame rate conversion involves motion compensation. In this technique, motion vectors are estimated for each pixel or group of pixels in a frame, indicating how much an image detail may have moved from one frame to the next. This information is then used to temporally interpolate an entirely new frame, which represents the state of the scene at a point somewhere in time between the two nearest input frames. Thus, frames are generated with precisely the correct temporal flow, as opposed to the stuttered motion that is produced when frames are simply dropped or repeated. Although it may offer better performance, this technique is less common in consumer devices due to its relatively high cost.
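To illustrate the idea only, the following toy sketch works on a single row of pixels and assumes one global, integer motion vector for the whole row. Practical systems estimate vectors per pixel or per block and typically blend both neighboring frames; all names here are hypothetical.

```python
def shift(row, dx, fill=0):
    """Move the contents of row right by dx pixels (left if dx is negative)."""
    n = len(row)
    return [row[i - dx] if 0 <= i - dx < n else fill for i in range(n)]


def estimate_global_shift(prev, curr, max_shift):
    """Exhaustive search for the shift minimizing the sum of absolute differences."""
    def sad(d):
        return sum(abs(a - b) for a, b in zip(shift(prev, d), curr))
    return min(range(-max_shift, max_shift + 1), key=sad)


def interpolate(prev, curr, t, max_shift=4):
    """Synthesize a row at fractional time t (0 = prev, 1 = curr)."""
    d = estimate_global_shift(prev, curr, max_shift)
    # Move the previous row along the estimated vector by the fraction t.
    return shift(prev, round(d * t))


prev = [0, 0, 9, 0, 0, 0, 0, 0]  # bright detail at position 2
curr = [0, 0, 0, 0, 0, 0, 9, 0]  # same detail at position 6
print(interpolate(prev, curr, 0.5))  # [0, 0, 0, 0, 9, 0, 0, 0]
```

A repeated frame would show the detail at position 2 or position 6, whereas the interpolated row places it at position 4, where the eye expects it to be halfway between the two input frames.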
Film Frame Rate Conversion
Television programs and commercials may frequently contain a mixture of video and film. For example, when a movie is broadcast to TV viewers, commercials or newsbreaks are inserted during the movie presentation. The result is content with a mixture of frame rates. The movie was originally shot on film at 24 Hz and converted to video, and the newsbreak is created as native video. If the display device operates at the video frame rate, then the movie will be sub-optimally displayed and motion judder will be apparent, while the newsbreak will be displayed perfectly. Conversely, if the display device operates at a rate optimal for the movie, then the newsbreak will have motion judder.
Accordingly, when a film section is combined with a video section to provide a video presentation, the film content of the video presentation will exhibit judder because it is presented with the 3:2 pulldown pattern. The problem of judder can also occur when video content of different frame rates is combined. Therefore, it is desirable to be able to process combined content that has different frame rates in an optimal manner, such that judder is eliminated or reduced and no artifacts are shown. Heretofore, there have been no systems that provide this optimal delivery of the video presentation when content of different frame rates is put together. The system and method must be cost effective, easily implemented and adaptable to existing video presentation systems.
Accordingly, what is needed is a system and method for displaying a video presentation in an optimal manner when content with different frame rates is provided together. The present invention addresses such a need.