Many contemporary high performance televisions, particularly large screen and wide screen versions, utilize a spatial-temporal resolution which is higher than the normal resolution and refresh rate. For example, a 100 Hertz (Hz) screen refresh rate may be employed for the television display rather than the standard 50 or 60 Hertz. However, because the field rate—the number of interlaced screen images or “fields” for the television—within the program signal received will typically be only 50 fields per second, the number of fields for display must be doubled.
For digital televisions employing a field memory—a memory with the capacity to store a digitized version of a complete television field—one technique for doubling the field rate involves simply writing to the field memory at a first rate and reading from the field memory at a second rate which is double the first rate. However, such field rate up-conversion by simple field repetition results in each movement phase (i.e., frame) being displayed multiple times, with moving objects appearing slightly displaced from their expected spatio-temporal (space-time) position in the repeated movement phases as illustrated in FIG. 5.
The space-time positioning 501a, 501b and 501c of an object moving linearly across the screen within a sequence of three fields n−2, n−1 and n is shown in FIG. 5. Field rate up-conversion by field repetition produces intermediate fields (not labeled) in which the space-time positioning of the object is 503a, 503b and 503c rather than the expected space-time positioning of 502a, 502b and 502c.
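The displacement introduced by field repetition can be illustrated numerically. The following is a minimal sketch only, assuming a one-dimensional object position, strictly linear motion, and hypothetical position values; the function names are illustrative and do not appear in the references cited herein.

```python
def repeat_fields(positions):
    """Double the field rate by showing each input field twice."""
    out = []
    for p in positions:
        out.extend([p, p])
    return out


def expected_positions(positions):
    """Positions expected at the doubled rate for linear motion:
    each original field followed by the midpoint to the next field."""
    out = []
    for a, b in zip(positions, positions[1:]):
        out.extend([a, (a + b) / 2])
    out.append(positions[-1])
    return out


# Hypothetical object positions (in pixels) in fields n-2, n-1 and n.
fields = [0.0, 10.0, 20.0]

repeated = repeat_fields(fields)       # what simple repetition displays
expected = expected_positions(fields)  # where the object should appear

# Displacement of each displayed field from its expected position.
errors = [r - e for r, e in zip(repeated, expected)]
print(errors)
```

Under these assumptions every repeated field lags the expected position by half of the inter-field displacement, while the original fields show no error.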
While the displacement is almost unnoticeable to the human eye for video information captured at the normal field rates (50–60 Hz) employed by video cameras and the like, motion picture cameras have, for historical electro-mechanical reasons, operated at a capture rate of 24 frames per second. While modern motion picture cameras have been improved, much film exists which was recorded at that previously standard capture rate. Such film is normally converted for television display by running the film at approximately 25 frames per second and scanning each frame twice, such that adjacent pairs of identical fields are created within the video information.
When up-converting a television-formatted motion picture to a higher field rate utilizing simple field repetition, the already duplicated fields are again duplicated, creating sequences of four identical fields within the video information and resulting in a significant amount of motion jitter and picture blurring. To address these problems, motion compensation techniques such as three-dimensional (3-D) recursive search block matching have been developed to provide motion-compensated interpolation. See, for example, G. de Haan, Motion Estimation and Compensation: An Integrated Approach to Consumer Display Field Rate Conversion (ISBN 90-74445-01-2) and G. de Haan et al., “True-Motion Estimation with 3-D Recursive Search Block Matching,” IEEE Trans. on Circuits and Systems for Video Technology, 3(5):368–379 (October 1993).
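The formation of four-field runs can be traced with a short sketch. The frame labels below are hypothetical, each label stands for one movement phase, and the helper name duplicate_each is illustrative only.

```python
from itertools import groupby


def duplicate_each(items):
    """Return a list in which every item appears twice in succession."""
    return [item for item in items for _ in range(2)]


# Hypothetical film frames, one label per movement phase.
film_frames = ["A", "B", "C"]

# Film-to-television conversion: each frame is scanned twice.
tv_fields = duplicate_each(film_frames)

# Field rate up-conversion by simple field repetition.
upconverted = duplicate_each(tv_fields)

# Length of each run of fields showing the same movement phase.
run_lengths = [len(list(group)) for _, group in groupby(upconverted)]
print(upconverted)
print(run_lengths)
```

Each movement phase is held for four consecutive display fields, so the displayed motion advances only once in every four fields.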
High definition television (HDTV) often imposes a requirement differing from—and either in addition to or in lieu of—field rate up-conversion: image resolution enhancement. As illustrated in FIGS. 6A and 6B, image resolution enhancement requires up-conversion from one resolution and the corresponding pixel size 601a and/or pixel density 602a to a higher resolution having a smaller pixel size 601b and/or greater pixel density 602b. Known interpolation techniques are employed to generate the additional pixels required from the original video information.
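As one example of such an interpolation technique, the sketch below doubles the pixel density of a single row of pixels by linear interpolation. Linear interpolation is chosen here purely for illustration and is an assumption; the pixel values and the function name are hypothetical.

```python
def upsample_row_linear(row):
    """Insert one linearly interpolated pixel between each adjacent pair."""
    out = []
    for a, b in zip(row, row[1:]):
        out.append(a)
        out.append((a + b) / 2)  # new pixel midway between its neighbors
    out.append(row[-1])
    return out


# Hypothetical luminance values for three adjacent pixels.
row = [10, 20, 40]
print(upsample_row_linear(row))
```

Interpolation of this kind only averages existing samples, so an edge spanning one pixel interval in the original is spread over two intervals at the higher density, which is one reason edge enhancement is commonly applied alongside resolution enhancement.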
As known in the art, the shape or magnitude of edges within an image contributes significantly to the overall impression of “sharpness” for the image. Accordingly, various edge enhancement techniques such as frequency peaking and luminance transient improvement (LTI) have been developed for use during image resolution enhancement. Frequency peaking involves linear boosting or “peaking” of selected spatial frequencies within the image, often with a bandpass or highpass filter to enhance the associated spatial frequencies and with adaptive control to avoid “unnaturalness” relating to, for example, peaking large and steep edges. Unlike frequency peaking, luminance transient improvement preserves the magnitude of the edge but increases the steepness of the edge, “pulling” samples near the edge on both sides towards the edge.
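The difference between the two techniques can be seen on a one-dimensional luminance ramp. The sketch below is a simplified illustration and not any particular published algorithm: peaking is approximated by adding a scaled highpass term, and luminance transient improvement is approximated by pulling each sample towards the nearer of the local minimum and maximum. The kernel, gain, radius, and strength values are all assumptions.

```python
def peaking(signal, gain=0.5):
    """Linear peaking: add a scaled highpass (kernel -1, 2, -1) term."""
    n = len(signal)
    out = []
    for i in range(n):
        left = signal[max(i - 1, 0)]       # edge samples are replicated
        right = signal[min(i + 1, n - 1)]
        highpass = 2 * signal[i] - left - right
        out.append(signal[i] + gain * highpass)
    return out


def lti(signal, radius=2, strength=0.5):
    """Simplified LTI: pull each sample towards the nearer local extreme."""
    n = len(signal)
    out = []
    for i in range(n):
        window = signal[max(i - radius, 0): i + radius + 1]
        lo, hi = min(window), max(window)
        s = signal[i]
        if s - lo < hi - s:
            target = lo
        elif hi - s < s - lo:
            target = hi
        else:
            target = s  # equidistant samples are left unchanged
        out.append(s + strength * (target - s))
    return out


def max_step(signal):
    """Largest difference between adjacent samples (edge steepness)."""
    return max(abs(b - a) for a, b in zip(signal, signal[1:]))


# Hypothetical soft edge rising from 0 to 100.
edge = [0, 0, 0, 25, 50, 75, 100, 100, 100]
peaked = peaking(edge)
sharpened = lti(edge)
print(peaked)
print(sharpened)
```

In this sketch the peaked signal overshoots the original 0 to 100 range at both ends of the ramp, whereas the LTI output stays within that range while its largest sample-to-sample step increases.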
Existing edge enhancement algorithms enhance the sharpness of an image based on the spatial information of the original image, often utilizing control parameters determined by a small spatial neighborhood of a given pixel position. While these techniques are generally sufficient for still images, time-varying conditions within video information such as (but not limited to) noise, motion, or lighting conditions, or even spatio-temporally varying conditions, may cause annoying artifacts in the processed video information. Conservative tuning of the parameters may prevent such artifacts, but also constrains the enhancement.
There is, therefore, a need in the art for enhancement of video information with spatio-temporal consistency, or consistency of enhanced image data both with spatially surrounding (enhanced) image data in the field containing the enhanced image data and with counterpart or corresponding image data within subsequent fields.