1. Field of the Invention
The present invention relates to a method and an apparatus for converting the resolution of an image consisting of a set of pixels that are represented by digital values.
2. Description of the Related Art
Until now, resolution conversion has been assumed to operate on images taken from the real world as photographs and then digitized, in which pixel values change continuously and discontinuous boundaries have a sufficient size. Most conventional image expansion techniques target natural images, which rarely contain stepped edges such as thin lines. This is mainly because the lowpass effect of the image pickup apparatus transforms a stepped edge, as it is recorded, into a form represented by a sigmoid function (a differentiable, continuous function represented by f(x)=1/(1+e^{-x})). Improvements have therefore focused on making outlines, which would otherwise become blurrier than necessary, appear natural, on the assumption that the original image was obtained by sampling a lowpass filtered image.
Among conventional methods, the replica and nearest neighbor methods require the fewest computations and are the easiest to implement. The replica method is the simplest: an image is expanded (n+1)/n times by copying the same pixel at every n-th pixel. The nearest neighbor method, on the other hand, expands an image by copying the pixel of the original image that is closest to the coordinates obtained after the resolution conversion. Both methods provide substantially the same effect; i.e., colors are not mixed among pixels, and color tones are completely preserved.
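As a rough illustration of the two methods on a single row of pixels, the following Python sketch may help. The function names `replica_expand` and `nearest_neighbor_expand` are hypothetical, and the pixel-center coordinate mapping used for the nearest neighbor case is an assumed convention rather than one taken from the description above.

```python
def replica_expand(row, n):
    """Expand a row (n+1)/n times by repeating every n-th pixel once."""
    out = []
    for i, pixel in enumerate(row):
        out.append(pixel)
        if (i + 1) % n == 0:  # every n-th pixel is copied a second time
            out.append(pixel)
    return out


def nearest_neighbor_expand(row, new_len):
    """Copy, for each output pixel, the closest pixel of the original row."""
    old_len = len(row)
    out = []
    for j in range(new_len):
        # Assumed convention: map the center of output pixel j back to the
        # source row; integer arithmetic computes floor((j + 0.5) * old / new).
        i = ((2 * j + 1) * old_len) // (2 * new_len)
        out.append(row[i])
    return out


row = ["A", "B", "C", "D"]
print(replica_expand(row, 2))           # 3/2-fold expansion
print(nearest_neighbor_expand(row, 6))  # 3/2-fold expansion
```

In both functions every output value is a copy of some input value, so no new colors are introduced by the expansion.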
Two further methods, bilinear interpolation and bicubic (cubic convolution) interpolation, are also well known for resolution conversion. With bilinear interpolation, the coordinates of a pixel point of the resultant image are inversely mapped to the coordinates of the original image. Then, the adjacent pixels of the original image (four surrounding points, two points on both sides, or only one point located at the identical coordinate) are weighted by distance to obtain an average value, which is taken as the color of the resultant image. Bicubic interpolation, on the other hand, is based on the same idea as bilinear interpolation but extends the neighborhood to two rings of surrounding pixels (16 points). This method assumes that the first derivative of the original image is continuous and that the change of values (i.e., the slope) is sufficiently moderate. Colors are enhanced by the parameter weights, which provides the advantage of making them clearer than with the bilinear method.
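The inverse mapping and distance weighting of bilinear interpolation can be sketched as follows for a gray-scale image stored as a list of rows. This is only a minimal illustration: the function name `bilinear_resize` is hypothetical, and the coordinate convention (corner pixels of the original and resultant images coincide) is an assumption, since the text above does not fix one.

```python
def bilinear_resize(img, new_h, new_w):
    """Resize a 2-D list of gray values with bilinear interpolation."""
    old_h, old_w = len(img), len(img[0])
    out = []
    for r in range(new_h):
        # Inverse mapping of the output row to source coordinates
        # (assumed convention: corner pixels of both images coincide).
        y = r * (old_h - 1) / (new_h - 1) if new_h > 1 else 0.0
        y0 = int(y)
        y1 = min(y0 + 1, old_h - 1)
        fy = y - y0
        out_row = []
        for c in range(new_w):
            x = c * (old_w - 1) / (new_w - 1) if new_w > 1 else 0.0
            x0 = int(x)
            x1 = min(x0 + 1, old_w - 1)
            fx = x - x0
            # Weight the (up to four) surrounding source pixels by distance.
            top = img[y0][x0] * (1 - fx) + img[y0][x1] * fx
            bottom = img[y1][x0] * (1 - fx) + img[y1][x1] * fx
            out_row.append(top * (1 - fy) + bottom * fy)
        out.append(out_row)
    return out


small = [[0, 10],
         [20, 30]]
for line in bilinear_resize(small, 3, 3):
    print(line)
```

When a mapped coordinate falls exactly on a source pixel, the fractional weights become zero and that single pixel is copied; when it falls on a row or column of source pixels, only two points contribute. These are the one-point and two-point cases mentioned above.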
In addition, a method that generalizes the above interpolation methods and that requires many more computations is known as a multirate system. In general, a digital signal processor (DSP) is required to implement a multirate system. The basic framework of the multirate system is to apply lowpass filtering after up-sampling with zero-value interpolation, and then to apply down-sampling with decimation to obtain a predetermined expansion factor. This framework can theoretically encompass bilinear interpolation and bicubic interpolation through the frequency characteristics of the lowpass filter. In practice, implementations called polyphase configurations or filter bank configurations are often used to reduce the computational load.
However, according to the replica method and the nearest neighbor method, the pixel of an original image is simply held at a new spatial sampling point, and the expanded line width differs depending on the coordinate position. Since the human eye is highly sensitive to low spatial frequencies per unit of visual angle, these methods cause a serious problem in the legibility of line widths.
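The position dependence of the expanded line width can be observed with a small experiment. The sketch below is illustrative only: the helper names and the pixel-center mapping are assumptions, and the test row is an arbitrary example containing two one-pixel lines.

```python
def nearest_neighbor_expand(row, new_len):
    """Nearest neighbor expansion with an assumed pixel-center mapping."""
    old_len = len(row)
    return [row[((2 * j + 1) * old_len) // (2 * new_len)]
            for j in range(new_len)]


def line_widths(row, ink=1):
    """Return the lengths of consecutive runs of `ink` pixels."""
    widths, run = [], 0
    for pixel in row:
        if pixel == ink:
            run += 1
        elif run:
            widths.append(run)
            run = 0
    if run:
        widths.append(run)
    return widths


row = [0, 1, 0, 0, 1, 0, 0, 0]        # two lines, each one pixel wide
expanded = nearest_neighbor_expand(row, 12)  # 3/2-fold expansion
print(expanded)
print(line_widths(row), line_widths(expanded))
```

With this input, one of the two originally identical lines becomes two pixels wide while the other stays one pixel wide, purely because of where each falls relative to the new sampling grid.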
Further, according to the bilinear interpolation method, when the coordinates determined for an inversely projected image are located at the same distance on either side of a one-dot width line, the resultant image will invariably represent the line as two dots wide at half the color intensity. Problems therefore arise with respect to the uniformity of colors, the legibility of characters, and the fidelity of color reproduction. Although an image in a photo may appear satisfactory, the overall impression is one of blur.
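This effect is easy to reproduce in one dimension. In the sketch below, `sample_linear` is a hypothetical helper that linearly interpolates a row at a real-valued, non-negative coordinate, and the sampling coordinates are chosen (as an assumption, for illustration) to lie at equal distances on either side of a one-dot line.

```python
def sample_linear(row, x):
    """Linearly interpolate `row` at the real-valued coordinate x (x >= 0)."""
    x0 = int(x)
    x1 = min(x0 + 1, len(row) - 1)
    fx = x - x0
    return row[x0] * (1 - fx) + row[x1] * fx


row = [0, 0, 255, 0, 0]          # one-dot width line at index 2
coords = [0.5, 1.5, 2.5, 3.5]    # hypothetical inversely mapped coordinates
print([sample_linear(row, x) for x in coords])
```

The two samples adjacent to the line each receive half of its value, so the single full-intensity dot is rendered as two half-intensity dots.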
Furthermore, the bicubic interpolation method has the same problem as the bilinear interpolation method, in that one line is changed into two lines at half the color intensity. Thus, accurate color reproduction of the screen image of a personal computer (hereinafter referred to as a PC) remains problematic. In addition, slight ringing occurs at sharp boundaries between middle tone colors.
Moreover, when an arbitrary lowpass filter is employed for the above multirate system, checkerboard distortion may appear in the resultant image, which imposes constraints on filter design. That is, constraints are placed on the filter design depending on the images to be processed and the required filter conditions; for example, the filter characteristics in the passband or stopband must be flat, or a separate filter for down-sampling must be provided in order to introduce an averaging operation. It is preferable to use an FIR (Finite Impulse Response) filter to maintain the linear phase of an image; however, a higher order filter is typically required to obtain an image quality higher than that achieved by the bilinear or bicubic interpolation method, so a dedicated DSP is believed to be necessary. In addition, as a problem associated with the use of a higher order filter, anything with a thin line structure, such as a font, appears expanded more than necessary.
In U.S. Pat. No. 5,594,676, a method for scaling is proposed wherein filtering ranging from 3×3 to a maximum of 65×65 is performed by changing the number of pixels used in the interpolation according to the position of a pixel. However, on a PC screen, to which the present invention is directed, there are many stepped-edge changes of one pixel width, so the above prior methods cannot perform a proper resolution conversion. The reason is considered below in terms of a multirate system.
FIG. 13 depicts a multirate system performing U/D-fold sampling-rate conversion. First, an input x[n] comprising n sample points is converted to a data string of n·U points by up-sampling (↑U) 201; the string is then passed through a lowpass filter 202; and finally a resultant data string y[m] of m=n·U/D points is obtained by down-sampling (↓D) 203. In up-sampling, U−1 zero values are added for each sample of x[n].
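The three stages of FIG. 13 can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the filter of any particular system: the function names are hypothetical, and the lowpass stage uses a triangular kernel of length 2U−1, which is the kernel that makes zero insertion followed by filtering equivalent to linear interpolation. It does not provide the anti-aliasing that a general down-sampling factor D would require.

```python
def upsample(x, U):
    """Stage 201: insert U-1 zeros after every input sample."""
    out = []
    for v in x:
        out.append(v)
        out.extend([0] * (U - 1))
    return out


def triangular_lowpass(u, U):
    """Stage 202 (assumed filter): triangular FIR kernel of length 2U-1,
    with samples beyond the ends of the string treated as zero."""
    n = len(u)
    out = []
    for k in range(n):
        acc = 0
        for i in range(max(0, k - U + 1), min(n, k + U)):
            acc += u[i] * (U - abs(k - i))  # integer tap weights U-|k-i|
        out.append(acc / U)                 # normalize the peak tap to 1
    return out


def downsample(v, D):
    """Stage 203: keep one sample out of every D."""
    return v[::D]


def resample(x, U, D):
    """U/D-fold sampling-rate conversion: up-sample, filter, decimate."""
    return downsample(triangular_lowpass(upsample(x, U), U), D)


print(upsample([0, 3, 6], 3))
print(resample([0, 3, 6], 3, 1))   # 3-fold expansion
print(resample([0, 3, 6], 3, 2))   # 3/2-fold conversion
```

Because the string is treated as zero beyond its ends, the last interpolated samples decay toward zero; a practical implementation would choose an explicit boundary treatment.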
FIG. 14 depicts an example of one-dimensional signals using U=3 for up-sampling. The upper left diagram shows the input string, while the upper right diagram shows the first twenty points of the data string whose length was tripled by the interpolation of zero values. The horizontal axis of these diagrams represents the discrete sample number, while the vertical axis represents the amplitude (level). The lower left diagram depicts the result of applying an FFT (Fast Fourier Transform) to x[n], while the lower right diagram depicts the result of applying an FFT to the up-sampled data string of 3·n points. The horizontal axis of these diagrams represents the normalized frequency [Hz], while the vertical axis represents the spectral amplitude expressed as the square of the absolute value. In both diagrams the frequency is normalized to [0,1], where 0 corresponds to the lowest frequency. As can be seen in the lower right diagram, the frequency components of the original image are compressed to one U-th of the frequency axis, while at the same time imaging components appear in the higher frequency domain.
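The appearance of imaging components can be checked numerically. The sketch below uses a direct discrete Fourier transform written out by hand rather than an FFT library, with an arbitrary short test signal; the helper names are hypothetical. It relies on the standard property that inserting U−1 zeros between samples makes the transform of the longer string equal to U repeated copies of the original transform.

```python
import cmath


def dft(x):
    """Direct discrete Fourier transform (O(N^2), fine for a short example)."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def upsample(x, U):
    """Insert U-1 zeros after every sample."""
    out = []
    for v in x:
        out.append(v)
        out.extend([0.0] * (U - 1))
    return out


x = [1.0, 2.0, 0.0, -1.0]   # arbitrary short test signal
U = 3
X = dft(x)
X_up = dft(upsample(x, U))

# Largest deviation between the up-sampled spectrum and U repeated copies
# of the original spectrum; it should be at the level of rounding error.
max_err = max(abs(X_up[k] - X[k % len(x)]) for k in range(len(X_up)))
print(len(X), len(X_up), max_err)
```

The first copy occupies one U-th of the frequency axis and corresponds to the original spectrum; the remaining copies are the imaging components that the lowpass filter is meant to remove.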
FIG. 15 depicts an example of one-dimensional signals using D=3 for down-sampling. The horizontal and vertical axes are the same as those shown in FIG. 14. In down-sampling, decimation is performed by retrieving one sample out of every D samples of the n·U data string. In this down-sampling example, the upper left diagram shows not the up-sampled result but an arbitrary input string. The upper right diagram shows the data string decimated to one third by picking up every third point of the input string shown in the upper left diagram, which has three times as many points. The lower left and lower right diagrams show the respective normalized FFT results. After decimation to one D-th, it is seen that the frequency components extend D times and that the gain decreases to one D-th. However, if the input string to this down-sampling has a frequency component higher than 1/D, the expanded frequency components exceed 1, so that aliasing components occur in the resultant spectrum. Therefore, in order to preserve the frequency components of an original image without being affected by imaging components and aliasing components, and to perform U/D-fold sampling-rate conversion, a lowpass filter (H(z)) 202 having a passband of [0, min(fsx/U, fsx/D)] must be disposed between the up-sampling and the down-sampling, where fsx is half the sampling frequency of the image displayed at resolution x.
The fact that a frequency component of an image exists even at fsx, without deterioration relative to the other components, means that changes in fonts or the like used on an information display screen can represent a steep signal in almost ideal form at a given resolution. In other words, if the fsx component and its nearby frequency components are attenuated, the display screen appears blurred. Assuming that the frequency components range from 0 to fsx, the spectral components of the image (resolution y) after sampling-rate conversion by multirate processing range from 0 to Dfsy/U while keeping the original spectral profile. Since image expansion is being considered here, D/U is less than 1, and as a result the frequency components from Dfsy/U to fsy become zero. Conventional methods such as bilinear interpolation and bicubic interpolation can be understood within the framework of this multirate processing, and their differences reduce to differences in the frequency characteristics of the lowpass filter. However, within the framework of multirate processing, frequency components higher than Dfsy/U cannot be generated, so for an original image with steep changes, such as a computer screen, expansion that preserves stepped edges is difficult. In a PC screen in particular, an image essentially has substantial frequency components up to fsy regardless of the resolution. Therefore, when this part is missing, the image gives a blurred impression, which is considered a more difficult problem than the expansion of natural images.
In summary, as is evident from the presence of a lowpass filter, the result of expansion processing within the framework of the multirate system contains only frequency components lower than the sampling frequency that can be expressed at the resolution after the expansion. It is therefore difficult to preserve stepped edges; they are blurred into sigmoid shapes, and the image as a whole gives a blurred impression. To cope with the above problems, several methods have been proposed to improve the blurred outlines produced by bilinear or bicubic interpolation. However, they have the following problems.
An article in the Proceedings of the 1999 IEICE General Conference, D-11-120, proposes a method of detecting an edge of an original image and reconstructing, for the resultant image, an edge having a slope that corresponds to that of the detected edge. This method can be regarded as reprocessing a part of the high frequency components, whereby the jaggies generated by the expansion of an oblique line can be replaced with fine steps on the resultant image. According to this method, however, a filter of at least 3×3 in size must be employed to detect the edge, and the fine structure of fonts cannot be expanded. As shown in FIG. 16, for example, the protrusions of the transverse line of a letter "t" are converted to a rounded bulge.
An article in the IEEE Proceedings of International Conference on Image Processing 1997, vol. I, pp. 267-270, October 1997, describes a method wherein linear expansion is employed after edge enhancement processing has been applied to an original image. Although this method consequently augments high frequency components, it assumes that there are boundary widths that can be detected by an edge detection filter, and it is therefore less effective for an object smaller than the size of the edge detection filter. In addition, unless shaping is applied to oblique boundaries, stair-stepping jaggies become conspicuous.
Further, in U.S. Pat. No. 5,717,789, a method for expanding an image is proposed, wherein high frequency components are estimated by utilizing the fact that zero crossing positions of Laplacian components are constant independent of a sampling rate, when representing an original image by a Laplacian pyramid. However, this technique can only address an expansion with integral multiples.
To solve the above technical problems, it is an object of the present invention to achieve fast processing in resolution conversion, such as image expansion at high resolution. It is another object of the invention to obtain a clear scaled image as a result of resolution conversion processing, without impairing the rough shape of fonts or the like, even on a screen, such as a PC screen, that includes many stepped edges.
To achieve the above objects, according to the present invention, an image conversion method for performing a resolution conversion on original input image data to generate an image scaled by rational number or integral number of times, comprises the steps of: generating first converted image data, wherein connect relations are maintained for a line width included in the original image data even after the resolution conversion; generating second converted image data, wherein the resolution conversion is performed on the original image data while maintaining a rough shape of this line width; and mixing the first and second converted image data in a given ratio.
This mixing step comprises mixing the first and second converted image data in different ratios depending upon a coordinate position on a screen to display the original image data. Further, instead of this mixing step, the method may include a step of generating an image from the generated first and second converted image data.
Further, according to the present invention, an image conversion method for converting original input image data into expansion image data that is expanded less than twice, comprises the steps of: generating first expanded image data, wherein for a one-pixel width line included in the original image data, one pixel width and its connect relations are maintained even after the expansion of less than twice; generating second expanded image data, wherein an expansion processing is performed on the original image data using linear interpolation; and mixing the first and second expanded image data to generate the expansion image data.
The mixing of the first and second expanded image data comprises determining a weight based on a criterion function concerning the spectral power of the original image data, and mixing the first and second expanded image data based on the determined weight, whereby the weights are advantageously determined automatically depending on the image.
The weights can also be set to constant values based on empirical rules. Further, if the weight "a" is determined in the range of 0.5-0.9, for example, according to the kind of image, the position in the image, or the user's settings, a sharp expanded image can be obtained for a one-pixel width line, and jaggies occurring on straight or curved lines are suitably moderated. The method may further include the step of inputting a user's preferences as adjustment values, wherein the mixing of the first and second expanded image data comprises mixing the first and second expanded image data based upon the input adjustment values, so that the final judgment of image quality is left to the user's preference.
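A mixing step with a single constant weight might look like the sketch below. It is only an illustration under assumptions: the text above does not state which image the weight "a" multiplies, so the sketch assumes a convex combination in which "a" weights the first (line-width-preserving) image and 1−a weights the second (linearly interpolated) image. The function name `mix` and the sample pixel values are hypothetical.

```python
def mix(first, second, a):
    """Blend two equally sized images as a * first + (1 - a) * second.

    Assumption: `a` weights the first (sharp) image; the source text does
    not specify the assignment.
    """
    if not 0.0 <= a <= 1.0:
        raise ValueError("weight a must lie in [0, 1]")
    return [
        [a * p + (1.0 - a) * q for p, q in zip(row1, row2)]
        for row1, row2 in zip(first, second)
    ]


sharp = [[0, 255, 0]]       # hypothetical first expanded image (thin line kept)
blurred = [[64, 128, 64]]   # hypothetical second expanded image (interpolated)
print(mix(sharp, blurred, 0.75))
```

With a weight inside the 0.5-0.9 range, the line keeps most of its intensity from the first image, while a small contribution from the second image softens the neighboring pixels.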
In another aspect, an image conversion method according to the present invention comprises the steps of: inputting original image data to be expanded at a predetermined scaling factor; when a line width included in the original image data is expanded less than or equal to an integral number of times not exceeding the scaling factor, generating an expanded image while preserving connect relations and color of each pixel composing the line; introducing a predetermined blurring into the expanded image to alleviate jaggies formed in the lines of the expanded image, and outputting the expanded image, wherein jaggies formed there are alleviated.
Here, jaggies refer to the notched, stair-stepped appearance of oblique lines, curved lines, or the like. Jaggies are alleviated by interpolation processing that preserves the rough shape of the lines composing the original image data. More specifically, an edge is blurred by linear interpolation such as bilinear interpolation or bicubic interpolation.
Further, according to the present invention, an image processing apparatus comprises: input means for inputting original image data to be scaled; linear image conversion means for performing a scaling processing on the input original image data to generate a linear resultant image which lacks spectral components of a high frequency domain; nonlinear image conversion means for performing a scaling processing on the input original image data to generate a resultant image which compensates for the spectral components in the high frequency domain lost by the linear image conversion means; and mixing means for mixing the resultant image generated by the linear image conversion means and the resultant image generated by the nonlinear image conversion means to generate scaled image data.
This nonlinear image conversion means generates a resultant image which preserves the connect relations and colors of lines, even when a line width included in the original image data is expanded less than or equal to an integral number of times not exceeding a scaling factor, thereby making it possible to compensate for the spectral components in the high frequency domain in much the same way as in the original image data.
Further, this nonlinear image conversion means comprises: a detector for detecting connect relations of a target pixel in the original image data in accordance with a value of the target pixel and values of neighboring pixels; an expanded width detector for determining whether the width of the target pixel can be expanded based on the connect relations that are detected by the detector; and a rule application unit for, according to a relation between a coordinate position of a pixel in the original image data and a coordinate position of a resultant pixel in the scaled image data, defining a plurality of types into which the relation is classified, and applying specific rules for the plurality of types that are defined, based on the output from the expanded width detector.
Further, according to the present invention, an image processing apparatus for converting original input color image data having low resolution into expansion color image data having high resolution, comprises: width examination means, for comparing a target pixel in the original color image data with image data that are composed of peripheral pixels to determine whether the target pixel is a pixel having a small width or a large width; determination means, for determining a value for a resultant pixel that constitutes the expansion color image data, wherein a small width is maintained when the width examination means determines that the target pixel has the small width, or a large width is further expanded when the width examination means determines that the target pixel has the large width; and blurring means for introducing a blurring at a predetermined rate into the resultant pixel which was determined by the determination means.
Here, the expansion color image data is obtained by increasing the resolution of the original color image data by a factor of less than two. The width examination means then determines whether a pixel having a small width is a pixel that constitutes a one-pixel width line, and the determination means determines the value of the resultant pixel so that the one-pixel width is maintained for the corresponding line of the expansion color image data. According to this configuration, a mixture of one-dot width and two-dot width lines can be avoided, thereby providing a very clear and sharp image for users.
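The idea of keeping one-pixel lines at one pixel while wider regions absorb the expansion can be illustrated on a single row. The sketch below is a hypothetical, simplified stand-in and is not the width examination or determination procedure of the apparatus, whose rules are not given in this section: it treats a run of identical pixels of length one as a "small width", every longer run as a "large width", and hands each extra pixel to the wide run that has grown the least so far. All function names are assumptions.

```python
def run_lengths(row):
    """Split a row into [value, length] runs of identical pixels."""
    runs = []
    for pixel in row:
        if runs and runs[-1][0] == pixel:
            runs[-1][1] += 1
        else:
            runs.append([pixel, 1])
    return runs


def expand_keep_thin(row, new_len):
    """Expand `row` to `new_len` pixels (less than twice its length),
    leaving one-pixel runs one pixel wide and widening only longer runs."""
    if not len(row) <= new_len < 2 * len(row):
        raise ValueError("new_len must be in [len(row), 2 * len(row))")
    runs = run_lengths(row)
    old = [length for _, length in runs]
    new = list(old)
    wide = [k for k, length in enumerate(old) if length >= 2]
    if new_len > len(row) and not wide:
        raise ValueError("no run wider than one pixel to absorb the expansion")
    for _ in range(new_len - len(row)):
        # Give the next extra pixel to the wide run that has grown the least.
        k = min(wide, key=lambda idx: new[idx] / old[idx])
        new[k] += 1
    out = []
    for (value, _), length in zip(runs, new):
        out.extend([value] * length)
    return out


row = [0, 0, 9, 0, 0, 0]          # a one-pixel line (value 9) on a background
print(expand_keep_thin(row, 9))   # 3/2-fold expansion
```

In this toy version the thin line keeps both its width and its original color, at the cost of a small shift in position relative to a uniform mapping.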
The image processing apparatus further comprises color examination means for examining the color of the target pixel in the original color image data, and the determination means determines the value of the resultant pixel by using the color of the target pixel identified by the color examination means unchanged, without mixing colors. For example, in the case of a font composed of thin lines, the shape of a character is hard to recognize unless the pixels to be linked have the same gradation and tone. According to the above configuration, however, a skeleton line is secured in the same color, thereby enhancing legibility.
Further, according to the present invention, an image processing apparatus for applying a resolution conversion on original image data to generate a converted image scaled by rational number or integral number of times, comprises: an original image data input unit for inputting the original image; a nonlinear processing unit for generating first converted image data, wherein connect relations are maintained for a line width included in the original image data even after the resolution conversion; a multirate processing unit for generating second converted image data, wherein the resolution conversion is performed on the original image data while maintaining a rough shape of the line width; and a mixing unit for mixing the first converted image data generated by the nonlinear processing unit and the second converted image data generated by the multirate processing unit.
Further, the image processing apparatus includes a weight determination unit for determining a mixing ratio of the first converted image data and the second converted image data, wherein the mixing unit mixes the first and second converted image data according to the mixing ratio determined by the weight determination unit.
In addition, the weight determination unit may determine the mixing ratio individually for each pixel on which the resolution conversion was performed. In such a case, for example, the mixing ratio can be changed for a portion of the screen where a change occurs, which can alleviate a limping phenomenon when an icon or the like is dragged in a GUI.
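Per-pixel mixing can be sketched by replacing the single weight with a weight map of the same size as the images. In the illustration below, the function name `mix_per_pixel`, the pixel values, and the way the weight map is filled are all assumptions: pixels flagged as belonging to a changing region are given weight 0 for the first converted image, so that only the second (linearly converted) image is used there.

```python
def mix_per_pixel(first, second, weights):
    """Blend two images with an individual weight for every pixel.

    Each output pixel is w * first + (1 - w) * second, where w is the
    entry of `weights` at the same position (assumed to lie in [0, 1]).
    """
    return [
        [w * p + (1.0 - w) * q for p, q, w in zip(row1, row2, row_w)]
        for row1, row2, row_w in zip(first, second, weights)
    ]


first = [[100, 100, 100, 100]]   # hypothetical first converted image
second = [[20, 20, 20, 20]]      # hypothetical second converted image
moving = [[False, False, True, True]]  # hypothetical mask of a dragged region

# Static pixels use a constant weight; moving pixels use the second image only.
weights = [[0.0 if m else 0.75 for m in row] for row in moving]
print(mix_per_pixel(first, second, weights))
```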
When the image processing apparatus sets the weights automatically, obtaining the spectral power (spectral average power) on a predetermined block basis, for example, allows a local weight to be set according to the position in the image. Alternatively, calculating a one-dimensional spectrum for the diagonal elements of an image, for example, advantageously reduces the amount of calculation compared with a two-dimensional spectrum.
Further, according to the present invention, an image processing apparatus for converting original image data into expansion image data that is expanded less than twice comprises: input means for inputting original image data; first processing means for generating, for a one-pixel width line in the original image, an expanded image of one-pixel width while preserving the connect relations and color of each pixel composing the line; and second processing means for introducing a predetermined blurring into the expanded image of this one-pixel width to alleviate jaggies formed in the expanded image of this one-pixel width.
This second processing means comprises a linear interpolation processing for preserving the rough shape of one-pixel width lines composing the original image data.
Further, according to the present invention, an image display apparatus for scaling original image data by rational number or integral number of times to display converted image data to a window screen comprises: original image data input means for inputting original image data; linear image conversion means for performing a scaling processing on the input original image data to generate a linear resultant image which lacks spectral components of a high frequency domain; nonlinear image conversion means for generating a resultant image which compensates for the spectral components in the high frequency domain lost by the linear image conversion means; mixing means for mixing the linear resultant image generated by the linear image conversion means and the resultant image generated by the nonlinear image conversion means to generate converted image data; and display means for displaying the converted image data to the window screen.
This original image data input means inputs original image data accompanying movement of images on the screen. Further, the mixing means generates converted image data while selecting the linear resultant image generated by the linear image conversion means when the original image data accompanying the movement of images are inputted by the original image data input means. Since in linear conversion a centroidal point moves evenly, limping and jumping that would occur upon movement can be alleviated.