Over recent years, the traditional cathode ray tube (CRT) display has faced increasing competition from alternative display principles, which are mainly based on active-matrix technology. In particular, active-matrix liquid crystal displays (AM-LCDs) have increased in performance and decreased in price so dramatically that the market share of the CRT is shrinking at a rapid pace. The main differentiating feature of these new display principles is their size: LCDs are thin, flat and lightweight. This enabled the first market for these displays: laptop computers. By now, the LCD has also almost taken over the desktop monitor market, where not only its size has made the difference, but also its uniform, sharp, and flicker-free picture reproduction. Nowadays, the CRT also faces competition from the LCD in its last stronghold: television.
To make a good television display, the LCD has had to overcome previous drawbacks, for example a limited viewing angle and limited color performance. However, the CRT is still unbeaten in one major aspect: motion portrayal. In that area, LCDs perform much worse, since the LC molecules that provide the basic display effect react slowly to image changes. This causes an annoying smearing (blurring) of moving objects, which has made the LCD ill-suited for video applications. Therefore, much effort has been put into speeding up the response of LC materials. This can be done by applying better materials, or by improved LC cell design. There is also a well-known method for response time improvement based on video processing, called ‘overdrive’. Overdrive improves the response speed of the LC pixels by changing the drive values depending on the applied gray-level transition. This enables a reduction of the response time to within the frame period. Currently, the best displays available list response times below the frame period (17 ms at 60 Hz). This is a crucial value, since the worst blurring artifacts are prevented for an LCD that can respond to image changes within the frame period.
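The overdrive principle described above can be illustrated with a small sketch. This is not the drive scheme of any particular panel: real panels use a measured two-dimensional look-up table indexed by (previous, target) gray level, and the linear `boost` factor as well as the function name below are assumptions made purely for illustration.

```python
def overdrive(prev_level, target_level, boost=0.5, lo=0, hi=255):
    """Illustrative overdrive: request a drive value beyond the target,
    in proportion to the gray-level transition, so that the slowly
    responding LC cell settles near the target within one frame period.
    The linear 'boost' factor is a simplification; real panels use a
    measured look-up table indexed by (prev_level, target_level)."""
    drive = target_level + boost * (target_level - prev_level)
    return max(lo, min(hi, int(round(drive))))
```

For a static pixel the drive value equals the target, while a large upward transition saturates at the maximum drive level, which is why overdrive cannot accelerate transitions to the extreme gray levels.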
However, speeding up the response of LC materials to lower values is not enough to completely avoid motion blur. This is caused by the active-matrix principle itself, which exhibits a sample-and-hold characteristic, causing light emission during the whole frame time (hold-type display). This is a major difference from the very short (microsecond) light flashes produced by the phosphors of the CRT (impulse-type display). It is well known that this prolonged light emission matches poorly with the way humans perceive moving images. As will be further explained in the next sections, the human eye tracks moving objects on the screen, thereby imaging the light belonging to each fixed point in a frame onto a series of points on the retina. This ‘point spreading’ results in a loss of sharpness of moving objects.
The basic function of a display system is to reconstruct the physical light emissions, corresponding to the original image, at the correct position and time on the screen from the received space-time discrete video signal. The characteristics of this reconstruction process, especially when combined with characteristics of the human visual system, can explain many image quality artifacts that occur in practical display systems.
The very basic representation of the signal chain 1 from original to displayed image is shown in FIG. 1. The original scene, represented as a time-varying image, is a space-time-continuous intensity function Ic({right arrow over (x)},t), where {right arrow over (x)} has two dimensions: {right arrow over (x)}=(x,y)T. This original image is sampled by the camera 100 in time and space. Since the spatial sampling is outside the scope of this specification, we will refer to it only occasionally from now on. The temporal behavior, however, will be the main focus for the remainder of this specification. The sampling process is described by:

$$I_s(\vec{x},t) = I_c(\vec{x},t)\cdot\Lambda(\vec{x},t), \qquad (1)$$
where Λ({right arrow over (x)},t) is a three-dimensional lattice of δ-impulses. We can assume a rectangular sampling lattice, which is described by sampling intervals Δ{right arrow over (x)}=(Δx,Δy) and Δt:
$$\Lambda(\vec{x},t) = \sum_{k,l,m} \delta(x - k\,\Delta x)\cdot\delta(y - l\,\Delta y)\cdot\delta(t - m\,\Delta t). \qquad (2)$$
The reconstruction of the physical light emission by the display 101 can be described by a convolution with the display aperture (also known as reconstruction function or point spread function). This aperture is also a function of space and time: A({right arrow over (x)},t). The image, as produced by the display 101, becomes:
$$I_d(\vec{x},t) = I_s(\vec{x},t) * A(\vec{x},t) = \left(I_c(\vec{x},t)\cdot\Lambda(\vec{x},t)\right) * A(\vec{x},t). \qquad (3)$$
The two operations of sampling and reconstruction account for a number of characteristic differences between the displayed image and the original image. These are best analyzed in the frequency domain, so we apply the Fourier transform, writing the transform of a signal I({right arrow over (x)},t) as If({right arrow over (f)}x,ft), to Eq. (3):

$$I_d^f(\vec{f}_x,f_t) = \left(I_c^f(\vec{f}_x,f_t) * \Lambda^f(\vec{f}_x,f_t)\right)\cdot A^f(\vec{f}_x,f_t), \qquad (4)$$
where the Fourier transform Λf({right arrow over (f)}x,ft) of lattice Λ({right arrow over (x)},t) is the reciprocal lattice, with spacings (Δx)−1, (Δy)−1 and (Δt)−1 (the frame rate).
The spatio-temporal spectrum of the original image, the sampled image, the displayed image and the finally perceived image, as a function of the normalized temporal frequency ftΔt and the normalized spatial frequency fxΔx, are depicted in the four plots of FIG. 2, respectively, for the case of an impulse-type (CRT) display. To simplify the illustration, we omit the spatial repeats, as if the signal were continuous in the spatial dimension. For the displayed images, this is equivalent to assuming that the spatial dimension has been reconstructed perfectly, i.e. the original continuous signal was spatially band-limited according to the Nyquist criterion, and the reconstruction effectively eliminates the repeat spectra.
In the temporal dimension, the impulse nature of the light emission gives a flat reconstruction spectrum. As a consequence of this flat spectrum, not only are the temporal frequencies in the baseband ft<(2Δt)−1 left unattenuated, but at least the lowest-order repeat spectra are passed as well.
The image, as it is finally perceived by the viewer, is also determined by the characteristics of the human visual system (HVS). In the temporal domain, the HVS mainly behaves as a low-pass filter, since it is insensitive to higher frequencies. The fourth plot of FIG. 2 shows that the perceived image is identical to the original image (cf. first plot of FIG. 2), if we assume that the eye's low-pass filter eliminates all repeat spectra. This assumption is not always true, which leads to one of the most widely known artifacts in display systems: large area flicker. This is caused by the first repeat spectrum (at low spatial frequencies), which is not completely suppressed for frame rates below approximately 75 Hz.
Active-matrix displays like LCDs do not have an impulse-type light emission. The fastest displays that are currently available have response times shorter than the frame period. However, even these will still have a light emission during the whole frame period due to the sample-and-hold behavior of the active matrix and the continuous illumination by the backlight. This behavior results in a temporal “box” reconstruction function with a width equal to the hold time Th. In the frequency domain, this becomes a sinc characteristic:

$$A^f(\vec{f}_x,f_t) = \mathrm{sinc}(\pi f_t T_h). \qquad (5)$$
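Eq. (5) can be evaluated numerically. The sketch below, using the convention sinc(x) = sin(x)/x as in Eq. (5), shows that a full-frame hold at 60 Hz passes DC unattenuated and has its first zero exactly at the 60 Hz sampling frequency:

```python
import math

def aperture_gain(f_t, T_h):
    """Temporal reconstruction spectrum of a 'box' hold of width T_h,
    Eq. (5): sinc(pi * f_t * T_h), with sinc(x) = sin(x)/x."""
    x = math.pi * f_t * T_h
    return 1.0 if x == 0 else math.sin(x) / x

T_h = 1 / 60                   # full-frame hold at a 60 Hz frame rate
dc = aperture_gain(0, T_h)     # 1.0: the baseband DC passes unattenuated
null = aperture_gain(60, T_h)  # ~0: zero transmission at the sampling frequency
```

The zero at the sampling frequency is the reason why hold-type displays suppress large area flicker at all frame rates.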
The spectrum of the sampled image, of the aperture A({right arrow over (x)},t), of the displayed image and of the finally perceived image for such a hold-type display are depicted in the four plots of FIG. 3, respectively. This immediately shows a distinctive advantage of hold-type displays over impulse-type displays: the sinc characteristic suppresses the repeat spectra in the displayed image (cf. the third plot of FIG. 3), and even has zero transmission at the sampling frequency. This eliminates large area flicker at all frame rates.
It may seem that the sample-and-hold behavior of the hold-type displays results in a better display than an impulse-type light emission. For static images this is indeed the case. However, the conclusion changes for a moving image:

$$I_m(\vec{x},t) = I_c(\vec{x}+\vec{v}t,\,t), \qquad (6)$$
where {right arrow over (v)} is the speed of the moving image over the screen, measured here in the same units that are used for {right arrow over (x)} and t. When the sampling intervals Δ{right arrow over (x)}=(Δx,Δy) are known, {right arrow over (v)} can also be expressed in “pixels per frame”. This corresponds to the “motion vector” or “frame displacement vector”.
Eq. (6) can also be transformed to the frequency domain, where it becomes:

$$I_m^f(\vec{f}_x,f_t) = I_c^f(\vec{f}_x,\,f_t - \vec{v}\cdot\vec{f}_x). \qquad (7)$$
This movement results in a shearing of the spectrum as shown in the second plot of FIG. 4, in comparison to the spectrum of the still original image in the first plot of FIG. 4. The shearing of the spectrum reflects that spatial variations in a moving object will generate temporal variations.
This moving image is then sampled (cf. the third plot of FIG. 4) and reconstructed in the display chain, after which it reaches the eye. The perception of moving images is characterized by another important property of the HVS: eye tracking. The viewer tries to follow moving objects across the screen in order to produce a static image on the retina. This mechanism is well studied, and enables the HVS to perceive moving images with a high level of detail. The image on the retina of an eye-tracking viewer is described by the inverse of the relations in Eqs. (6) and (7):

$$I_e(\vec{x},t) = I_d(\vec{x}-\vec{v}t,\,t), \qquad I_e^f(\vec{f}_x,f_t) = I_d^f(\vec{f}_x,\,f_t + \vec{v}\cdot\vec{f}_x). \qquad (8)$$
The whole chain 5 from original image to perceived image, comprising a motion instance 500 (due to moving objects), a sampling instance 501 (e.g. a camera), a reconstruction instance 502 (e.g. a display), a tracking instance 503 (the viewer tracking the motion) and a low-pass filter 504 (the eye), is shown in FIG. 5. Substituting Eq. (3) in Eq. (8) and applying Eq. (7), gives the image as projected onto the retina of the eye tracking viewer:
$$\begin{aligned} I_e^f(\vec{f}_x,f_t) &= \left(I_m^f(\vec{f}_x,\,f_t+\vec{v}\cdot\vec{f}_x) * \Lambda^f(\vec{f}_x,\,f_t+\vec{v}\cdot\vec{f}_x)\right)\cdot A^f(\vec{f}_x,\,f_t+\vec{v}\cdot\vec{f}_x) \\ &= \left(I_c^f(\vec{f}_x,f_t) * \Lambda^f(\vec{f}_x,\,f_t+\vec{v}\cdot\vec{f}_x)\right)\cdot A^f(\vec{f}_x,\,f_t+\vec{v}\cdot\vec{f}_x). \end{aligned} \qquad (9)$$
The perceived image Ipf({right arrow over (f)}x,ft) after low-pass filtering by the eye is shown in the third plot of FIG. 6 for an impulse-type display, and in the fourth plot of FIG. 7 for a hold-type display, wherein the plots of FIGS. 6 and 7 complement the plots of FIG. 4, respectively. The image after the eye low-pass is obtained by only looking at the frequencies ft≈0, again assuming perfect reconstruction in the spatial domain. There we can see that the effect of the temporal aperture function of the display, combined with eye tracking, can be described as spatial filtering of moving images:
$$I_p^f(\vec{f}_x) = I_c^f(\vec{f}_x)\cdot A^f(\vec{f}_x,\,\vec{v}\cdot\vec{f}_x) = I_c^f(\vec{f}_x)\cdot H^f(\vec{f}_x), \qquad (10)$$
with the spatial low-pass filter

$$H^f(\vec{f}_x) = \mathrm{sinc}(\pi\,\vec{v}\cdot\vec{f}_x\,T_h). \qquad (11)$$
The filter Hf({right arrow over (f)}x) of Eq. (11) depends on the speed of motion {right arrow over (v)} and the hold time (frame period) Th.
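The attenuation predicted by Eq. (11) can be sketched numerically. In the sketch below, speed is expressed in pixels per frame, spatial frequency in cycles per pixel, and a full-frame hold (Th equal to the frame period) is assumed, so that the argument of the sinc reduces to π·v·fx; this normalization is an assumption of the illustration:

```python
import math

def motion_blur_gain(v_ppf, fx_cpp):
    """Spatial low-pass filter H^f of Eq. (11) for a full-frame hold:
    sinc(pi * v * fx), with v in pixels per frame and fx in cycles
    per pixel (Nyquist frequency = 0.5)."""
    x = math.pi * v_ppf * fx_cpp
    return 1.0 if x == 0 else math.sin(x) / x

# A static image is not filtered at all:
static = motion_blur_gain(0.0, 0.5)   # 1.0 at any spatial frequency
# At 2 pixels/frame the Nyquist frequency is suppressed completely:
nyq = motion_blur_gain(2.0, 0.5)      # ~0: first zero of the sinc
```

Already at a moderate speed of 2 pixels per frame, the finest representable detail is thus lost entirely.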
FIG. 8 schematically depicts the amplitude response of this filter as a function of the motion speed |{right arrow over (v)}| (in pixels per frame) and the normalized spatial frequency fxΔx along the motion direction $\vec{f}_x\cdot\vec{v}/|\vec{v}|$, wherein the white region represents amplitudes between 1 and 0.5 (low attenuation) and the shaded region represents amplitudes between 0.4 and 0 (high attenuation).
Although the temporal “hold” aperture is beneficial with respect to large area flicker, it causes a spatial blurring of moving objects on the retina of the viewer. Higher spatial frequencies are attenuated by the sinc characteristic, and the spatial frequency at which the attenuation sets in decreases with increasing speed, so that an ever larger spatial frequency region is affected. Furthermore, this blurring occurs only along the motion direction; the sharpness perpendicular to the motion of each object is not affected.
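The combined effect of the hold-type aperture and eye tracking can also be made visible directly in the spatial domain. The following sketch (an illustration only, not part of the processing chain of this specification) averages a static one-dimensional luminance profile over the sub-frame positions of a tracking eye, which reproduces the box blur whose spectrum is the sinc of Eq. (11):

```python
import math

def perceived_profile(profile, v_ppf, samples=64):
    """Retinal image of a tracking viewer on a hold-type display: during
    one frame the displayed profile is static while the eye moves v_ppf
    pixels, so the retina integrates the profile over the eye's sub-frame
    positions -- a box blur of width v_ppf pixels along the motion."""
    n = len(profile)

    def sample(x):  # linear interpolation, clamped at the borders
        if x <= 0:
            return profile[0]
        if x >= n - 1:
            return profile[-1]
        i = int(math.floor(x))
        frac = x - i
        return (1 - frac) * profile[i] + frac * profile[i + 1]

    return [sum(sample(i + v_ppf * s / samples) for s in range(samples)) / samples
            for i in range(n)]
```

A sharp edge moving at 4 pixels per frame acquires a transition region roughly 4 pixels wide, whereas a static edge (v = 0) is reproduced unchanged.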
Eq. (11) suggests that, in order to decrease this effect, the hold time Th must be decreased. This can be achieved in two ways. First of all, the frame rate can be increased. In order to have the required effect, this must be done with a motion-compensated frame rate conversion, since a simple frame repetition will result in the same effective hold time. Secondly, without changing the frame rate, we can decrease the period (or better: duty-cycle) of light emission. For LCDs, this can be realized by switching the backlight on only during a part of the frame time, using a so-called “scanning backlight”.
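The benefit of shortening the hold time can be quantified with Eq. (11). In the sketch below the hold time is written as duty · frame period, with speed in pixels per frame and spatial frequency in cycles per pixel (an assumed normalization, for illustration only):

```python
import math

def hold_attenuation(v_ppf, fx_cpp, duty):
    """Attenuation of Eq. (11) with hold time T_h = duty * frame period;
    v_ppf in pixels per frame, fx_cpp in cycles per pixel (assumed
    normalization). A shorter light-emission duty cycle widens the
    passband of the motion-dependent sinc."""
    x = math.pi * v_ppf * fx_cpp * duty
    return 1.0 if x == 0 else math.sin(x) / x

# 2 pixels/frame motion at a spatial frequency of 0.25 cycles/pixel:
full = hold_attenuation(2.0, 0.25, 1.0)   # full-frame hold: ~0.64
scan = hold_attenuation(2.0, 0.25, 0.25)  # 25% scanning-backlight duty: ~0.97
```

Reducing the duty cycle to 25% thus recovers most of the contrast at this spatial frequency, at the cost of a corresponding loss in light output.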
A third option for decreasing motion blur due to the sample-and-hold effect, based on Eq. (11), uses only video processing and requires no modification of the display or backlight. The low-pass filtering of the display+eye combination 903 (consisting of reconstruction 901 by the display and tracking/low-pass filtering 902 by the viewer/eye) is pre-compensated in the video domain, as shown in the display chain 9 of FIG. 9. This can be achieved by using the inverse filter 900 of the filter Hf({right arrow over (f)}x) of Eq. (11):
$$H_{inv}^f(\vec{f}_x) = \frac{1}{\mathrm{sinc}(\pi\,\vec{v}\cdot\vec{f}_x\,T_h)}. \qquad (12)$$
The inverse filter Hinvf({right arrow over (f)}x) is a purely spatial filter, reflecting the observation that the temporal aperture of the display, combined with eye tracking, results in a spatial low-pass filter Hf({right arrow over (f)}x). The cascade 9 of the inverse filter 900 and the display+eye combination 903 further along the chain should result in a perceived image that approaches the original image as closely as possible.
EP 0 657 860 A2 discloses the use of an approximation {tilde over (H)}invf({right arrow over (f)}x) of such a pre-compensation filter Hinvf({right arrow over (f)}x) 900 in the shape of a speed-dependent high spatial frequency enhancement filter (or high spatial frequency boosting filter), which enhances the spectrum of the video signal at high spatial frequencies, related to the moving components in the images, according to the speed of those components. Therein, the cut-off frequency of the spatial frequency enhancement filter (from which the enhancement starts) is adjusted according to motion vectors estimated by a motion vector estimator. The filter {tilde over (H)}invf({right arrow over (f)}x) deployed in EP 0 657 860 A2 is not the exact inverse filter Hinvf({right arrow over (f)}x) as defined in Eq. (12), because the restoration of frequencies that have been attenuated to very low levels (for instance in the zeroes of the spatial low-pass filter Hf({right arrow over (f)}x) of Eq. (11)), e.g. below noise thresholds, cannot realistically be achieved.
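The practical consequence just noted, namely that the exact inverse of Eq. (12) diverges at the zeroes of Hf({right arrow over (f)}x), can be sketched as a gain computation with an explicit cap. The cap value `max_gain` is a hypothetical parameter chosen for illustration; EP 0 657 860 A2 does not prescribe a specific limit:

```python
import math

def precompensation_gain(v_ppf, fx_cpp, max_gain=4.0):
    """Approximate inverse-filter gain 1/|sinc(pi * v * fx)| (cf. Eq. (12),
    full-frame hold assumed, speed in pixels/frame, frequency in cycles/pixel).
    Near the zeroes of the sinc the exact inverse diverges and would only
    amplify noise, so the gain is clipped to the hypothetical max_gain."""
    x = math.pi * v_ppf * fx_cpp
    h = 1.0 if x == 0 else math.sin(x) / x
    if abs(h) < 1.0 / max_gain:
        return max_gain
    return 1.0 / abs(h)
```

For a static image the gain stays at unity, while at the first sinc zero (2 pixels/frame at the Nyquist frequency) the gain saturates at the cap instead of diverging.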
FIG. 10 depicts the transfer function of the spatial low-pass filter Hf({right arrow over (f)}x) 1000 of Eq. (11), of the inverse filter Hinvf({right arrow over (f)}x) 1001 of Eq. (12), and of an approximation 1002 of the inverse filter Hinvf({right arrow over (f)}x) of Eq. (12) as a function of the spatial frequency, wherein said approximation 1002 is similar to the high spatial frequency enhancement filter of EP 0 657 860 A2.
Spatial frequency enhancement filters as disclosed in EP 0 657 860 A2 also enhance the high spatial frequency components of noise that is present in the sampled images of the video signal. Moreover, in flat (undetailed) image parts, the motion estimator has a high probability of estimating a wrong motion vector, which determines the cut-off frequency of the spatial frequency enhancement filter; at high filter gains this results in undesirable noise amplification that significantly degrades the quality of the images of the video signal.