A typical motion judder cancellation system comprises three parts: a film detector, a motion estimator, and a frame rate upconvertor. If the system processes interlaced image sequences, a deinterlacer can also be present. The film detector analyzes information between consecutive images to detect whether the incoming video sequence contains motion. This motion is classified into one of several common patterns of motion.
Video means there is motion in every image. For video, the temporal distance is 1 image. Every image is called a phase 0 image.
22 pull-down means there is motion once every 2 images in a repetitive pattern. The first of these 2 images is referred to as a base image. The base image is called the phase 0 image and the next image the phase 1 image. The temporal distance for 22 pull-down is 2 images. So there is no motion between consecutive images of phase 0 and 1, and there is motion between phase 1 and the new phase 0.
32 or 23 pull-down means 3 images have no motion amongst each other and, then, 2 images have no motion amongst each other, or vice versa. Herein, the following definitions are used. In film mode, phase 0 and 3 are those images that are the first image of the repeating part. 32 pull-down would arrive in the following sequence: phase 0, phase 1, phase 2, phase 3, phase 4, phase 0, phase 1, etc. Phases 0, 1 and 2 relate to the same first film image, while phases 3 and 4 relate to the same second film image. Both the first of the three images from the first film image and the first of the two images from the second film image are referred to as base images. The first base image is the phase 0 image, the next image is the phase 1 image, the next image is the phase 2 image, the next image (which is a base image) is the phase 3 image, and the next one is the phase 4 image. The temporal distance for 32 pull-down is alternately 3 and 2 images. So there is no motion between consecutive images of phase 0 and 1, and there is no motion between consecutive images of phase 1 and 2. There is motion between phases 2 and 3, but none between phases 3 and 4. From phase 4 to the new phase 0 there is motion again.
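The phase definitions above can be illustrated with a minimal sketch of a pattern classifier. It assumes that a per-image motion flag is already available (True when an image differs from the image before it); the names PATTERNS and classify, and the requirement of two full periods before a match is accepted, are assumptions made for this illustration and do not describe any particular film detector.

```python
from typing import List, Optional, Tuple

# Motion flag per phase over one repetition period. flags[p] is True when
# there is motion between the image of phase p and the image preceding it.
PATTERNS = {
    "video": [True],
    "22 pull-down": [True, False],
    "32 pull-down": [True, False, False, True, False],
}


def classify(motion: List[bool]) -> Optional[Tuple[str, int]]:
    """Return (pattern name, phase of the last image), or None for fallback.

    motion[i] is the motion flag of the i-th received image.
    """
    for name, period in PATTERNS.items():
        n = len(period)
        if len(motion) < 2 * n:
            continue  # assumption: require two full periods before matching
        for offset in range(n):
            # offset is the phase of the first image in the observed window
            if all(motion[i] == period[(i + offset) % n]
                   for i in range(len(motion))):
                return name, (len(motion) - 1 + offset) % n
    return None  # no known pattern: treat in fallback mode


flags_32 = [True, False, False, True, False] * 2
print(classify(flags_32))
```

A 23 pull-down sequence has the same cyclic flag pattern as 32 pull-down and is matched by the same entry with a different starting offset.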
Any sequence that does not fall into one of these categories could be treated in a fallback mode, which is usually equivalent to the mode selected for the video pattern, potentially causing judder. Alternatively, such a sequence could be treated as either 22 pull-down or 32 pull-down, potentially causing severe artifacts in the frame rate upconvertor.
If the pattern is not video, the motion estimator will typically use this pattern to estimate motion vectors between the most recent image and the latest earlier image that differs from it. These vectors will not be global for the entire image, but will be localized to specific areas of the image. Thus, these vectors indicate how parts of the image move over time. The vectors will be used by the frame rate upconvertor to interpolate new images between these two images. The frame rate upconvertor outputs these new images instead of some images of the original pattern. Because the sequence then no longer appears intermittently stationary, the motion will appear smoother to a viewer, canceling the so-called judder artifact.
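The interpolation step can be sketched as follows. This is a simplified illustration under stated assumptions, not the method of any specific upconvertor: images are lists of rows of integer luminance values, one vector (dy, dx) per square block gives the displacement from the earlier image to the most recent image, the vector is fetched at the output position, and alpha (0 for the earlier image, 1 for the most recent image) is the temporal position of the new image. The names interpolate, block and alpha are hypothetical.

```python
from typing import List, Tuple

Image = List[List[int]]


def interpolate(prev: Image, curr: Image,
                vectors: List[List[Tuple[int, int]]],
                block: int, alpha: float) -> Image:
    """Build an image at temporal position alpha between prev and curr."""
    h, w = len(prev), len(prev[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            dy, dx = vectors[y // block][x // block]
            # Follow the vector backwards into prev and forwards into curr,
            # clamping to the image borders.
            py = min(max(y - round(alpha * dy), 0), h - 1)
            px = min(max(x - round(alpha * dx), 0), w - 1)
            cy = min(max(y + round((1 - alpha) * dy), 0), h - 1)
            cx = min(max(x + round((1 - alpha) * dx), 0), w - 1)
            # Blend the two motion-compensated samples.
            out[y][x] = round((1 - alpha) * prev[py][px]
                              + alpha * curr[cy][cx])
    return out


# A bright pixel moves 4 positions to the right between two film images.
prev = [[0, 100, 0, 0, 0, 0, 0, 0]]
curr = [[0, 0, 0, 0, 0, 100, 0, 0]]
middle = interpolate(prev, curr, [[(0, 4)]], block=8, alpha=0.5)
print(middle)  # the pixel appears halfway, at position 3
```

Replacing a repeated image by such an interpolated image places the moving object at an intermediate position instead of showing it stationary.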
Unfortunately, there are some exceptions to the above categories:
1. Some TV stations, like TMF and MTV, regularly broadcast material in which the images are partly in a pull-down pattern (e.g. background) and partly in a video pattern (e.g. ticker bars, presenter). So, in 32 pull-down mode, the film part of the image is the same in phases 0, 1 and 2, while the video part differs among these phases. The same holds for phases 3 and 4: the film part is the same, while the video part is different. In 22 pull-down mode, the film part of the image is the same in phases 0 and 1, while the video part differs among these phases.
2. Also in digital video pictures resulting from MPEG coding, a considerable pull-down contribution may be present. Besides motion being present between every two consecutive images, in this case there is also a pull-down-like motion pattern (e.g. a higher motion contribution between phase 1 and the new phase 0 and a lower motion contribution between phase 0 and phase 1). Processing this type of sequence as pull-down causes serious de-interlace artifacts.
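The first exception can be made concrete with a small sketch. It uses hypothetical data: an image split into two regions, a background in 32 pull-down and a ticker bar in video, each with one motion flag per image. The region split and the helper names global_flags and region_flags are assumptions made for illustration only.

```python
from typing import List

# Motion flags for phases 0..4 (True: the region differs from the
# previous image). Hypothetical data for one 32 pull-down period.
FILM_32 = [True, False, False, True, False]  # background (film part)
VIDEO = [True] * 5                           # ticker bar (video part)

# One row per image, one column per region.
region_motion: List[List[bool]] = [[FILM_32[p], VIDEO[p]] for p in range(5)]


def global_flags(rows: List[List[bool]]) -> List[bool]:
    """Motion flag of the whole image: motion in any region."""
    return [any(row) for row in rows]


def region_flags(rows: List[List[bool]], region: int) -> List[bool]:
    """Motion flags of a single region over time."""
    return [row[region] for row in rows]


print(global_flags(region_motion))     # every image shows motion
print(region_flags(region_motion, 0))  # the background keeps its film cadence
```

Judged on the whole image, the sequence shows motion in every image and therefore resembles video, although the background still follows the 32 pull-down cadence.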
With respect to the frame rate upconvertor, falling back to video will cause judder again in example 1 for the pull-down parts. Treating the sequence as pull-down will cause the frame rate upconvertor to introduce a new judder-like artifact in the video component. Also, the motion estimator will estimate inconsistent motion vectors, while the deinterlacer will introduce severe artifacts.
WO 02/056597 and WO 2004/054256 disclose methods for recognizing film and video occurring in parallel in television images.