Present-day digital imaging devices, such as televisions, digital cameras, and camcorders, offer advanced features that allow a viewer to adjust the image displayed. For instance, the viewer can adjust the color balance of the output image, the image's brightness and/or contrast, and the image's resolution. One popular feature is a zoom-in/out feature that allows the viewer to magnify or shrink the displayed image. Also popular are picture-in-picture (PIP) and picture-on-picture (POP) features where the display is divided into multiple windows and each window displays a different image signal (i.e., channel), or image. Here, the sizes of the windows can vary according to the viewer's preferences. Accordingly, the degree to which the image is magnified or reduced, i.e., the scaling factor, also varies.
To convert an input image of one resolution or size to an output image of another resolution or size, each target output pixel value is calculated as a weighted sum of the pixel values of the input pixels within a convolution window surrounding the target output pixel. For most video standards, e.g., ITU-R BT.470, ITU-R BT.601, ITU-R BT.709, each pixel value of an input video signal is sampled according to a two-dimensional rectangular sampling grid. Similarly, for most display technologies currently in use, e.g., cathode ray tube (CRT), liquid crystal display (LCD), plasma display panel (PDP), digital micro-mirror device (DMD), and liquid crystal on silicon (LCoS), each pixel value is also displayed according to a two-dimensional rectangular grid on the screen of the display device. For these sampling and display systems operating in two-dimensional rectangular grids, separable two-dimensional interpolation filters are often used to reduce the implementation complexity of filtering functions in hardware, firmware, software, or a combination thereof.
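The separable filtering described above can be illustrated with a minimal sketch. It assumes a two-tap linear (triangle) kernel, pixel-center alignment, and edge clamping; the function names `resample_1d` and `resample_2d` are hypothetical and are not taken from any standard or product. A practical scaler would use a wider kernel, particularly for reduction.

```python
def resample_1d(samples, out_len):
    """Resample a 1-D list to out_len samples using a 2-tap linear kernel."""
    in_len = len(samples)
    scale = in_len / out_len
    out = []
    for j in range(out_len):
        # Map the center of output pixel j back into input coordinates
        # (assumed pixel-center alignment), clamping at the borders.
        pos = (j + 0.5) * scale - 0.5
        pos = min(max(pos, 0.0), in_len - 1.0)
        left = int(pos)
        right = min(left + 1, in_len - 1)
        frac = pos - left
        # Weighted sum of the two input pixels inside the window.
        out.append((1.0 - frac) * samples[left] + frac * samples[right])
    return out


def resample_2d(image, out_h, out_w):
    """Scale a 2-D image (list of rows) with two 1-D passes."""
    # Horizontal pass: filter every row independently.
    rows = [resample_1d(row, out_w) for row in image]
    # Vertical pass: filter every column of the intermediate result.
    cols = [resample_1d([r[x] for r in rows], out_h) for x in range(out_w)]
    # Transpose the column-major result back to a list of rows.
    return [[cols[x][y] for x in range(out_w)] for y in range(out_h)]
```

Because the two passes are independent, the vertical pass only needs as many stored scan lines as the vertical kernel has taps, which is why separable filters are attractive in hardware.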
A number of interpolation techniques can be used to calculate the value of the output pixel during image scaling. For example, a linear interpolation method calculates the output pixel value as a linear combination of the input pixel values. Typical linear interpolation systems can utilize a bilinear filter, a bicubic filter, a windowed sinc filter, or other similar filters, for performing image magnification and reduction. While simple, linear interpolation systems can produce visual artifacts in the output image, such as blurriness in fine details, ringing along sharp edges, jaggy edges along diagonal sharp edges, fold-over frequencies, i.e., aliasing, in areas containing frequency components above the Nyquist frequency, and beat patterns, i.e., moiré, in areas containing frequency components near the Nyquist frequency.
Some of these visual artifacts are objectionable to viewers, especially when the output images are magnified and displayed on a high definition (HD)-capable display device. In particular, blurred details and edges in the output image are a common result of image magnification. These artifacts are exacerbated when the input image itself is blurry and dull, i.e., is not sharp. Generally, the perceived sharpness of a displayed image depends on the relative contrast along the edges in the image. In particular, an image is perceived by the human eye to be sharper as the relative contrast along its edges increases.
In a conventional video transmission and display system, a video signal is passed through a high-frequency peaking filter such that a second derivative component is added to the video signal to enhance the relative contrast along the edges of the image. Nevertheless, if the video signal is then passed through a low pass filter during transmission or storage to reduce its bandwidth, the high-frequency components of the original video signal are attenuated or removed. Thus, if the video signal is scaled up to a higher resolution using a linear scaling method, where new frequency components are not generated, the high-frequency components will not be present at the higher resolution because no such frequency components are present in the original video signal. The resulting output image can exhibit undesirably smooth, as opposed to crisp, edges with inadequate perceived sharpness. This effect is particularly pronounced when an image is magnified.
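The peaking operation mentioned above can be sketched in one dimension as follows. This is a minimal illustration, not the filter of any particular system: it assumes a three-tap discrete second difference as the second derivative estimate, a hypothetical `gain` parameter, and a sign chosen so that contrast across an edge increases.

```python
def peak(signal, gain=0.5):
    """Add a scaled second-difference component to a 1-D signal."""
    n = len(signal)
    out = []
    for i in range(n):
        # Replicate the border samples at both ends.
        left = signal[max(i - 1, 0)]
        right = signal[min(i + 1, n - 1)]
        # Discrete second derivative estimate (three-tap second difference).
        second_diff = left - 2 * signal[i] + right
        # Subtracting it raises the bright side and lowers the dark side
        # of an edge, which increases the relative contrast there.
        out.append(signal[i] - gain * second_diff)
    return out


edge = [0, 0, 0, 10, 10, 10]
peaked = peak(edge, gain=0.5)
# peaked is [0, 0, -5, 15, 10, 10]: contrast across the edge grows
# from 10 to 20, at the cost of undershoot and overshoot.
```

The undershoot and overshoot visible in this example are the linear-domain counterpart of the ringing artifacts discussed elsewhere in this section.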
Moreover, in a conventional video transmission and display system, the bandwidths of the two chrominance, i.e., color difference, channels of a color video signal are usually narrower than the bandwidth of its luminance channel, e.g., Y, in order to reduce the transmission or storage bandwidth. Thus, the high-frequency components of the chrominance channels are attenuated or removed even further. The chrominance channels have lower resolutions than the luminance channel, which results in smoother color edges of the displayed image with poor perceived sharpness.
For example, according to the NTSC analog color television standard, a 4.2 MHz luminance bandwidth is typical to provide a horizontal resolution comparable with the vertical resolution when displayed on a screen with a 4:3 aspect ratio. In the chrominance channels, however, only a 1.3 MHz bandwidth for the I channel and a 500 kHz bandwidth for the Q channel are considered acceptable. According to the ITU-R BT.601 digital component video standard, there are several luminance (Y) and chrominance (Cb and Cr) signal sampling formats, such as 4:4:4, 4:2:2, 4:1:1, and 4:2:0. Except for the 4:4:4 YCbCr format, where the chrominance (Cb and Cr) signals have the same sampling rates and resolutions as the luminance (Y) signal, the chrominance (Cb and Cr) signals have sampling rates and resolutions that are one half or one quarter of those of the luminance (Y) signal in the horizontal and/or vertical dimensions. To display pixel data in such YCbCr formats with subsampled chrominance data, the video signal is first converted to the 4:4:4 YCbCr format using interpolation to generate the missing Cb and Cr pixel data in the horizontal and/or vertical dimensions. Such interpolation usually results in a blurrier color edge in the displayed image.
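The conversion from a subsampled format to 4:4:4 can be illustrated for one scan line of 4:2:2 chrominance data. The sketch below assumes that each chrominance sample is co-sited with an even-numbered luminance sample and that missing samples are generated by averaging the two neighbors; the function name `upsample_chroma_422` is hypothetical, and real converters may use longer filters or different siting.

```python
def upsample_chroma_422(chroma_row):
    """Double the horizontal chroma resolution by linear interpolation."""
    n = len(chroma_row)
    out = []
    for i in range(n):
        cur = chroma_row[i]
        # Replicate the last sample at the right border.
        nxt = chroma_row[min(i + 1, n - 1)]
        # Existing sample, assumed co-sited with an even luma position.
        out.append(float(cur))
        # Missing sample at the odd luma position: average of neighbors.
        out.append((cur + nxt) / 2.0)
    return out


cb = [16, 16, 240, 240]          # a sharp color edge in 4:2:2 data
cb_444 = upsample_chroma_422(cb)
# cb_444 is [16, 16, 16, 128, 240, 240, 240, 240]: the interpolated
# value 128 spreads the edge over an extra pixel.
```

The intermediate value introduced at the edge is one way to see why interpolated chrominance tends to look softer than the luminance it accompanies.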
In order to improve the perceived sharpness of an image, the high frequency components of each channel can be enhanced in order to increase the relative contrast along the edges of the image, which in turn, increases the perceived sharpness of the details in the image. In addition, because the smooth edges and blurred details in an image are due to slow transitions in the video signal at the edges, the image can be made sharper by shortening the transitions. The transitions in the video signal are referred to as luminance transients and chrominance transients. Various techniques can be used to shorten the luminance and chrominance transients in order to improve the perceived sharpness of an image without increasing the resolution. For example, in linear processing techniques, a high frequency detail signal is extracted from the input signal transitions and, after controlled amplification, is added back to the transitions without group delay error and with proper phasing. In nonlinear processing techniques, nonlinearly shortened transitions are generated from the original image transitions without introducing undershoot and overshoot ringing artifacts. In both cases, transitions in the resultant output video signal are shorter in duration, and thus the output image is perceptually sharper than the input image.
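A nonlinear shortening of a transition can be sketched as follows. This is only one commonly described approach, swapped in for illustration and not necessarily the technique of any specific system: the signal is steepened with a second-difference term and then clamped to the local minimum and maximum of the input, which prevents undershoot and overshoot. The function name and the `gain` and `radius` parameters are assumptions.

```python
def improve_transient(signal, gain=1.0, radius=2):
    """Steepen transitions in a 1-D signal without exceeding local extremes."""
    n = len(signal)
    out = []
    for i in range(n):
        # Replicate the border samples at both ends.
        left = signal[max(i - 1, 0)]
        right = signal[min(i + 1, n - 1)]
        second_diff = left - 2 * signal[i] + right
        # Linear steepening step (same idea as a peaking filter).
        steepened = signal[i] - gain * second_diff
        # Nonlinear step: clamp to the range of the input samples in a
        # local window, so no undershoot or overshoot can appear.
        window = signal[max(i - radius, 0):min(i + radius + 1, n)]
        out.append(min(max(steepened, min(window)), max(window)))
    return out


ramp = [0, 0, 0, 2, 5, 8, 10, 10, 10]    # a slow transition
sharp = improve_transient(ramp)
# sharp is [0, 0, 0, 1, 5, 9, 10, 10, 10]: the transition is steeper
# around its center, and no value falls outside the range 0..10.
```

Without the clamp, the same input would produce values of -2 and 12 on either side of the transition, which is the ringing that the nonlinear techniques are intended to avoid.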
Typically, linear sharpness enhancement and nonlinear transient improvement are performed in series, i.e., the high frequency components of the video signal are enhanced and then the transients improved, or vice versa, after an image has been magnified or reduced. For best results, sharpness enhancement and transient improvement should be performed horizontally and vertically, i.e., in two dimensions. Thus, each operation utilizes its own set of line buffers to store pixel data in more than one scan line so that two dimensional image processing can be performed.
A display system that includes an image scaler module, an image sharpness enhancement module, and a transient improvement module can generate magnified images that have crisp edges and sharp details in both horizontal and vertical dimensions. Nonetheless, such a system is often considered impractical because each module requires its own set of pixel buffers and line buffers. Such pixel buffers and line buffers increase the complexity and cost of the system. Moreover, reading and writing image pixel data from and to the buffers consumes system resources such as power and memory bandwidth. This additional cost and complexity are prohibitive for less expensive display systems, and therefore images displayed by such systems are typically blurry and soft at the edges in either or both of the horizontal and vertical dimensions.
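The buffering cost described above can be estimated with simple arithmetic. The sketch below assumes that a vertical filter with a given number of taps stores one fewer scan lines than it has taps (the newest line being streamed in), and the line width, tap counts, and bytes per pixel are purely hypothetical values chosen for illustration.

```python
def line_buffer_bytes(width, taps, bytes_per_pixel):
    """Storage for the line buffers of one vertical filter stage.

    Assumes (taps - 1) stored lines; the current line is streamed in.
    """
    return (taps - 1) * width * bytes_per_pixel


# Hypothetical cascade: scaler, sharpness enhancement, transient
# improvement, each with its own 4-tap vertical window on 1920-pixel
# lines at 3 bytes per pixel.
stages = {"scaler": 4, "sharpness": 4, "transient": 4}
per_stage = {name: line_buffer_bytes(1920, taps, 3)
             for name, taps in stages.items()}
total = sum(per_stage.values())
# Under these assumptions each stage needs 17280 bytes and the cascade
# needs 51840 bytes, three times the storage of a single stage.
```

Under these assumptions the storage, and the associated read and write traffic, grows in proportion to the number of separately buffered stages.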
Accordingly, it is desirable to provide a method and system that is suitable for enhancing sharpness and improving transients during image magnification or reduction. The method and system should be cost effective and computationally efficient. In addition, the method and system should produce an output image substantially without blurriness, ringing, jaggy edges, aliasing, moiré, and other visual artifacts over a wide range of image magnification and reduction factors.