1. Field of the Invention
The invention relates generally to image processing. More specifically, the invention relates to image processing suited to bandwidth-constrained applications such as videoconferencing.
2. Description of the Related Art
A digital image of a scene/environment has a particular size which is defined by the number of rows and columns of pixels (individual color/intensity points) that it contains. The image size or “resolution” is thus expressed as the number of columns multiplied by the number of rows. For instance, an image with a resolution of 768×576 has 768 columns and 576 rows of pixels for a total of 442,368 pixels.
Often, the original size of an image as captured by an imaging device such as a camera or as later represented is too large for a particular application. While a larger resolution image contains more image information (more pixels per area) and is likely of a more desirable visual quality than a lower resolution image, bandwidth, memory and other constraints may dictate that a lower resolution image be used. For certain devices, such as digital cameras, it may be desirable to reduce the device's overall cost by utilizing a smaller resolution image so that the required storage component in the device is also smaller. In the context of videoconferencing, for instance, certain standardized image formats such as QCIF (Quarter Common Intermediate Format) have been defined so that receiving and transmitting nodes do not have to be concerned with converting discordant image sizes. In videoconferencing, it is often desirable to maintain a certain “frame” rate (the rate at which individual image frames are received and/or rendered for output). To maintain this frame rate, formats such as QCIF have been defined which are typically smaller than most captured digital image sizes, particularly those captured from certain digital cameras. Since an image may not be originally the same resolution as that desired by a particular application, a process known as image scaling is employed. When an image is scaled “up,” its size is increased and when it is scaled “down” its size is reduced. Hereinafter, when the application refers to “scaling” or “scaled image”, down scaling or reduction in image size is the intended meaning and usage of those terms.
The scaling of an image should be distinguished from image cropping, where the resolution is reduced by cutting out a portion of the image. Scaling implies that while the size of the image is reduced, the scene/environment in the unscaled image (hereinafter variously referred to as the “original” or “unscaled” image) is largely preserved. The scene from the original image remains complete but is represented in a lower resolution after scaling.
Image scaling has been achieved in the art in several ways. The most common scaling technique averages the pixels in a particular image region with equal weighting and then “decimates” or throws away entire pixels in the region, thereby generating a pixel in the scaled image. The averaged pixel replaces an entire region of pixels, with the replaced region not necessarily the same size as the averaging region. For instance, consider a 2:1 scaling procedure where each two by two region of pixels in the original image is to be replaced by a single pixel in the scaled image. When determining the value of the scaled image pixel, it may be desirable to average together a larger region than the 2 by 2 region of replacement, such as a 3 by 3 neighborhood. In such an instance, the “sampling” region (3×3) is said to be larger than the “scaling” region (2×2) and may be useful in ensuring that more of the image is considered, so that features that start in the scaling region and bleed over past it are given proper consideration. An averaging method where each pixel in the sampling region is given equal weight, however, is deficient in several regards. Primarily, the equal averaging of pixels has the effect of losing much of the original image information. Equal weight averaging does little to identify image features, since it treats all parts of the image region identically and then decimates all pixels.
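The equal-weight averaging described above can be sketched as follows. This is a minimal illustration, not a prescribed implementation: the function name `box_downscale`, the parameters `factor` and `window`, the choice to anchor the sampling region at the top-left corner of each scaling region, and the clamping of the sampling region at the image border are all assumptions made for the example.

```python
def box_downscale(image, factor=2, window=3):
    """Equal-weight averaging followed by decimation.

    image  : list of rows, each a list of intensity values
    factor : side of the scaling region (2 gives 2:1 scaling)
    window : side of the sampling region (3 gives a 3x3 neighborhood)

    Assumption: the sampling region starts at the top-left pixel of the
    scaling region and is clipped where it would leave the image.
    """
    rows, cols = len(image), len(image[0])
    scaled = []
    for r0 in range(0, rows - factor + 1, factor):
        scaled_row = []
        for c0 in range(0, cols - factor + 1, factor):
            total = 0
            count = 0
            # Every pixel in the sampling region contributes equally.
            for r in range(r0, min(r0 + window, rows)):
                for c in range(c0, min(c0 + window, cols)):
                    total += image[r][c]
                    count += 1
            # One averaged pixel replaces the whole scaling region.
            scaled_row.append(total / count)
        scaled.append(scaled_row)
    return scaled


# A 4x4 image with a sharp vertical edge between columns 1 and 2.
edge_image = [[0, 0, 100, 100] for _ in range(4)]
scaled = box_downscale(edge_image)
```

In this example the scaled pixel covering the dark left half takes an intermediate value because the 3×3 sampling region reaches one column past the edge, which illustrates how equal weighting smears a feature rather than identifying it.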
To overcome this loss of information, certain image filtering mechanisms have been developed. For instance, for a 4:1 scaling procedure (where every 4 columns and 4 rows of pixels are replaced by a single pixel), a filter with 7 “taps” or coefficients has been developed. The seven-tap filter is applied to seven rows and columns of pixels that center about and include a center pixel. Typically, the filtering coefficients used are {1,2,4,8,4,2,1} such that the first pixel of a row or column sampled is weighted (multiplied) by 1, the second pixel in any row and column by 2, and so on. Notably, this typical filter design assigns a disproportionate weight to the center pixel (8 times that of a pixel at the beginning or end of the sampling region). Likewise, a large weighting (4 times that of the beginning or end pixels) is given to pixels which are immediately adjacent to the center pixel. The assumption behind this weighting is an arbitrary one but has important consequences. Specifically, edge features, if they do not lie near the center of the sampling region where the weighting is highly disproportionate to other areas, will be blurred or entirely lost because their intensity values are not prominent in the weighting.
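A sketch of the seven-tap filter with coefficients {1,2,4,8,4,2,1} applied for 4:1 scaling is given below. Several details are assumptions made for illustration and are not stated in the text: the names `filter_1d` and `downscale_4to1`, normalizing by the sum of the taps (22), placing the center pixel at index 2 of each group of four, replicating border pixels where the window extends past the image, and applying the filter separably (rows first, then columns).

```python
TAPS = [1, 2, 4, 8, 4, 2, 1]  # coefficients quoted in the text


def filter_1d(samples, taps=TAPS, step=4):
    """Weighted average of len(taps) samples around one center per `step` samples.

    Assumptions: centers sit at indices step//2, step//2 + step, ...;
    indices outside the line are clamped to the nearest valid sample;
    the result is normalized by the sum of the taps.
    """
    half = len(taps) // 2
    norm = sum(taps)
    n = len(samples)
    out = []
    for center in range(step // 2, n, step):
        acc = 0
        for k, weight in enumerate(taps):
            idx = min(max(center + k - half, 0), n - 1)
            acc += weight * samples[idx]
        out.append(acc / norm)
    return out


def downscale_4to1(image):
    """Apply the seven-tap filter along rows, then along columns."""
    rows_done = [filter_1d(row) for row in image]
    columns = [list(col) for col in zip(*rows_done)]
    cols_done = [filter_1d(col) for col in columns]
    # Transpose back so the result is a list of rows again.
    return [list(row) for row in zip(*cols_done)]


# An 8x8 image reduces to 2x2 under these assumptions.
flat_image = [[5] * 8 for _ in range(8)]
scaled = downscale_4to1(flat_image)
```

A uniform image passes through unchanged in value, since the normalized taps sum to one; the interesting behavior appears only when intensity varies across the seven-pixel window.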
Edges, which are defined by the abrupt differential of intensity/color from one image area or pixel to the next, must have their intensity/color values properly weighted or represented in the sampling. If an edge feature does not lie in the center of the sampling region for the seven-tap filter, but rather in a corner or side of the sampling region, it will be weighted only ⅛ or ¼ of the amount given to the center pixel (if applying the {1,2,4,8,4,2,1} seven-tap filter). The non-edge pixels in the center of the sampling region will dominate and thus the edge feature will be blurred or entirely lost. If the edge feature lies in the center of the sampling region, the pixels that comprise the edge will be adequately represented. Since it is impossible to predict exactly where edges are present in an image even with edge detection schemes, statistically, the typical seven-tap filter is deficient in preserving edge features in the scaled image. While a seven-tap filter does enlarge the sampling area for a 4:1 scaling procedure, it does so at the expense of edge preservation. Thus, there is a need for a scaling technique that will better preserve edge information in scaled images.
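The ⅛ and ¼ figures follow directly from the tap values, and a short calculation makes the effect on a thin feature concrete. The sketch below is illustrative only: the helper names `relative_weights` and `line_response`, the normalization by the tap sum, and the sample intensities (a one-pixel-wide line of value 220 on a background of 0) are assumptions chosen for the example.

```python
from fractions import Fraction

TAPS = [1, 2, 4, 8, 4, 2, 1]  # coefficients quoted in the text


def relative_weights(taps=TAPS):
    """Weight of each tap position relative to the center tap."""
    center = taps[len(taps) // 2]
    return [Fraction(weight, center) for weight in taps]


def line_response(position, line_value=220, background=0, taps=TAPS):
    """Filter output when a one-pixel-wide line sits at `position`
    (0..6) inside the seven-pixel sampling window.

    Assumption: the output is normalized by the sum of the taps.
    """
    acc = 0
    for k, weight in enumerate(taps):
        value = line_value if k == position else background
        acc += weight * value
    return acc / sum(taps)


weights = relative_weights()
at_side = line_response(0)    # line at the start of the window
at_center = line_response(3)  # line on the center pixel
```

Under these assumptions the same line contributes eight times less to the scaled pixel when it falls at the side of the window than when it falls on the center pixel, which is the position dependence the text identifies as the cause of lost edges.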
Furthermore, with regard to implementation, if scaling is to be implemented in hardware such as a CMOS (Complementary Metal-Oxide Semiconductor) imaging device, it is important to reduce the computational complexity of the scaling procedure, especially when many other functions must also be carried out by the device. When an imaging device is used to transmit image frames (a sequence of individual still images) for the purpose of videoconferencing, the transmission must be fast enough to maintain the frame rate and be compatible with the bandwidth capability of the interface between the imaging device and the processing device (computer system) that is used to package and transmit the captured image frames to the destination node. For dual-mode devices, which may provide both motion and still imaging, methods and apparatus that can readily provide different levels of scaling interchangeably are also desired.
A method for scaling of an image that includes providing an N-tap filter to be applied upon an image region and applying the N-tap filter variously to the image region and obtaining therefrom at least one scaled image pixel, such that N is equal to the scaling factor plus one.