1. Field of the Invention
The present general inventive concept relates to image processing and, more particularly, to a method of changing the size/format of a digital image that can be used in image editing.
2. Description of the Related Art
Known solutions in the field of image processing include linear scaling, framing, reformatting (retargeting), seam carving, and retouching.
Scaling is generally used to reduce and enlarge photos. It is also a basic operation underlying image processing operations that require rotation, distortion, affine transformation, deinterlacing, and increasing the temporal resolution of video sequences. In most cases, scaling is based on interpolation, which finds an intermediate value from an available discrete set of known values.
Above all, interpolation is used to generate a full-color image from an interleave transfer sensor of a digital camera or a video camera. At present, a variety of image interpolation methods of differing complexity and performance is available.
General methods, such as bicubic and bilinear interpolation, introduce undesirable artifacts in regions of the image where brightness changes sharply. Typical artifacts include edge dentation and tailing, which result from the assumption that the image is smooth, and the Gibbs effect, which results from the exclusion of high-frequency components of the image spectrum.
Usually, two-dimensional interpolation is made up of a series of one-dimensional interpolations in mutually perpendicular directions. For example, U.S. Pat. No. 6,915,026 enlarges an image in two stages, preliminarily calculating and saving the interpolation coefficients, first in the vertical direction and then in the horizontal direction. This method has the disadvantage that all one-dimensional interpolations are made parallel to the coordinate axes of the image, so the resulting artifacts reveal the underlying coordinate grid. Some methods apply preliminary low-frequency filtering in a diagonal direction, but this unfortunately degrades image quality. U.S. Pat. Nos. 5,719,967 and 6,606,093 describe special methods for suppressing dentation of edges in the image.
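The decomposition of two-dimensional interpolation into perpendicular one-dimensional passes can be illustrated with a minimal sketch. It assumes plain linear interpolation and an image stored as a list of rows; the function names `resize_row_linear` and `resize_separable` are hypothetical, and the sketch does not reproduce the coefficient caching of the cited patent.

```python
def resize_row_linear(row, new_len):
    """Resample a 1-D list of samples to new_len values by linear interpolation."""
    old_len = len(row)
    if new_len == 1 or old_len == 1:
        return [float(row[0])] * new_len
    scale = (old_len - 1) / (new_len - 1)  # maps output index to input position
    out = []
    for k in range(new_len):
        x = k * scale
        i = min(int(x), old_len - 2)       # left neighbor, clamped at the end
        t = x - i                          # fractional distance to the left neighbor
        out.append((1.0 - t) * row[i] + t * row[i + 1])
    return out


def resize_separable(image, new_h, new_w):
    """Resize a 2-D list of rows: vertical pass first, then horizontal pass."""
    # Vertical pass: every column is resampled independently.
    columns = [resize_row_linear(list(col), new_h) for col in zip(*image)]
    rows = [list(r) for r in zip(*columns)]
    # Horizontal pass: every row of the intermediate result is resampled.
    return [resize_row_linear(r, new_w) for r in rows]


if __name__ == "__main__":
    for line in resize_separable([[0, 10], [20, 30]], 3, 3):
        print(line)
```

Because both passes run parallel to the coordinate axes, a diagonal edge is reconstructed as a staircase, which is the grid-aligned artifact mentioned above.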
The solution described in U.S. Pat. No. 5,446,804 is based on preliminary calculation of a sub-pixel map of edges. Each interpolated pixel has four nearest neighbors. If none of these neighboring pixels is separated from the others by an edge, the required value is calculated as a weighted sum of the neighbors. If the neighbors lie on different sides of an edge, an intermediate value is calculated for the required pixel using two diagonal pixels lying on one side of the edge, which is selected based on the map. The required value at a position lying on the other side of the edge is replaced with a value obtained from the neighbors lying on one side of the edge and a calculated value. This solution yields an image with sharp borders, but one that still contains dentate edges.
The methods listed below aim to overcome the effect of dentate edges by not interpolating along the axes of the coordinate grid and by performing a two-dimensional enlargement in both the vertical and horizontal directions. The first method, described in Li, X., and M. Orchard, “New Edge Directed Interpolation,” IEEE International Conference on Image Processing, Vancouver, September 2000, is based on the assumption of a geometric duality between the covariances of the low-resolution and high-resolution images. Edges in the image are preserved because the interpolation coefficients adapt to an arbitrarily oriented step edge. This method is intended to enlarge an image by a factor of 2. The interpolation includes two steps. First, the pixel at coordinates (2i+1, 2j+1) is calculated from the known values at positions (2i, 2j). Then, a similar procedure, rotated by 45 degrees, is applied to the remaining pixels. Each pixel is represented as a linear combination of its four nearest neighbors, and the coefficients of the linear combination are found by the least-squares method. This interpolation method provides acceptable results, except in areas whose structure does not satisfy the geometric duality assumption.
The second method, described in Yu, X., Morse, B., and T. Sederberg, “Image Reconstruction Using Data-Dependent Triangulation,” IEEE Computer Graphics and Applications, vol. 21, no. 3, pp. 62-68, 2001, triangulates a surface in which the brightness value of a point is treated as its height above some zero level. The authors construct a triangulation grid and restore the image by linear interpolation of intensity inside each triangle. This approach is based on the data-dependent triangulation (DDT) method combined with various optimization methods and criterion functions. Accordingly, the triangulation preserves edges (sharp changes in brightness) and improves image quality. However, the triangulation is performed iteratively and is therefore computationally complex.
Sometimes, enlarging an image includes two stages. First, the unknown brightness values are calculated, and then the image is postprocessed to increase its sharpness and improve the edges. For example, U.S. Pat. No. 6,714,688 describes such postprocessing. First, bilinear interpolation of the image is performed, and then the image is postprocessed using a moving window. The size of the window is determined by the degree of enlargement. First, a low-frequency filter is applied to the pixels inside the window to separate out the high-frequency component of the image. After that, the high-frequency components are summed with the low-frequency components using coefficients determined from local brightness characteristics. Then, the edges are additionally processed using a special curve describing the change in brightness.
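The split into low-frequency and high-frequency components and their weighted recombination can be sketched on a one-dimensional brightness profile. This is only an illustration under stated assumptions: a box filter stands in for the low-frequency filter, and a single fixed `gain` replaces the locally adaptive coefficients of the cited patent. The names `box_lowpass` and `sharpen` are hypothetical.

```python
def box_lowpass(signal, radius=1):
    """Low-frequency component: mean over a moving window of up to 2*radius+1 samples."""
    n = len(signal)
    out = []
    for i in range(n):
        lo = max(0, i - radius)
        hi = min(n, i + radius + 1)   # the window is truncated at the borders
        window = signal[lo:hi]
        out.append(sum(window) / len(window))
    return out


def sharpen(signal, gain=1.5, radius=1):
    """Recombine low + gain * high, where high = signal - low.

    gain is an assumed constant; gain > 1 amplifies the high-frequency detail.
    """
    low = box_lowpass(signal, radius)
    return [l + gain * (s - l) for s, l in zip(signal, low)]


if __name__ == "__main__":
    # A blurred step edge becomes steeper, with overshoot on both sides.
    print(sharpen([0, 0, 10, 10]))
```

With a gain of 1 the original signal is returned, and a constant signal is left unchanged for any gain, so only regions with brightness variation are affected.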
The method described in EP Patent Application No. 1533899 changes the size of a digital image using interpolation along borders and additional postprocessing. First, the image is enlarged by a factor of 2n and then reduced to the required size. Additional postprocessing of edges is then performed. This method provides good results but is computationally complex.
Methods of intellectual cropping, or framing, of an image (photo) are also well known. They are intended to change the ratio of the geometrical sizes (the ratio of sides) of the image, for example, the ratio of width to height, by cropping the bottom and/or top (or left and/or right) parts of the image. The term “intellectual” means that the photo is cropped based on analysis of its contents in order to avoid cropping an important object captured in the photo. The need to change the ratio of sides of a photo arises when a user of a digital camera wishes to print the digital image. A typical digital photo has a ratio of sides of 4:3, whereas a standard sheet of photographic paper has a ratio of sides of 3:2.
Two approaches to cropping a photo do not rely on analysis of its contents. In the first approach, the top and bottom edges of the photo are cropped in a proportion of 50% to 50% or 20% to 80%. Accordingly, if the height of the photo is reduced by 1 cm, horizontal strips 5 mm and 5 mm high, or 2 mm and 8 mm high, are cut from the top and the bottom. In many cases, this approach does not crop an important object located at the center of the picture. However, if the object of shooting, for example a person, is close to the edge of the photo, this approach may crop parts of the face, the head, or other parts of the image of the person.
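The arithmetic of this content-independent cropping can be written out as a short sketch. The 120 mm by 90 mm print size is an assumed example of a 4:3 photo, and the helper names `excess_height` and `split_crop` are hypothetical.

```python
def excess_height(width, height, ratio_w, ratio_h):
    """Height to remove so that width:height becomes ratio_w:ratio_h."""
    target_height = width * ratio_h / ratio_w
    return height - target_height


def split_crop(excess, top_percent):
    """Split the excess into (top, bottom) strips; top_percent goes to the top strip."""
    if not 0 <= top_percent <= 100:
        raise ValueError("top_percent must be between 0 and 100")
    top = excess * top_percent / 100
    return top, excess - top


if __name__ == "__main__":
    # Assumed example: a 4:3 photo of 120 mm x 90 mm printed at 3:2.
    excess = excess_height(120, 90, 3, 2)   # 10.0 mm, i.e. 1 cm
    print(split_crop(excess, 50))           # (5.0, 5.0)
    print(split_crop(excess, 20))           # (2.0, 8.0)
```

The split is fixed in advance, so nothing in the computation depends on where the subject is located, which is why a subject near the edge can be cut.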
The other approach is to print the photo on a sheet with blank left and right margins instead of cropping it. However, this approach has the disadvantage that the area of the sheet of photographic paper is not completely used.
The main problem in the automatic cropping task is the detection and segmentation of an important object (or objects) in the image. Methods of detecting an important object fall into two categories. Pixel-based methods select a small group of pixels, or individual pixels, corresponding to parts of an object captured in the photo; edge selection methods, for example, belong to this category. Region-based methods select an area corresponding to a whole semantically significant object in the image.
Currently, the automatic cropping task has been researched only superficially. Image processing software packages in which the photo framing function is obviously based on selection of the main objects of shooting are well known to the authors.
The program XV (www.trilon.com/xv) has a function of automatically cropping an image, which operates as follows:
1. Boundary lines and columns of the image (top and bottom lines and leftmost and rightmost columns) are selected.
2. Variation in brightness in selected lines and columns is detected. Homogeneous lines and columns in a half-tone image are cut completely. Lines and columns in a color image having a low value of spatial and spectral correlation are cut.
3. Operations 1 and 2 are repeated as many times as necessary.
Accordingly, the program cuts fairly homogeneous areas from the edges of the image. It does not analyze the contents of the image as a whole. In practice, dark edges of a scanned image, which appear when the original is misaligned before scanning, are effectively removed. However, unacceptable results often appear because the contents of the scene are insufficiently analyzed.
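The iterative border trimming described in the three steps above can be sketched as follows. This is not the source code of XV; it is a minimal illustration for a half-tone (grayscale) image stored as a list of rows, and the homogeneity test with a `tolerance` parameter is an assumption.

```python
def is_uniform(values, tolerance):
    """A line is treated as homogeneous if its brightness range is within tolerance."""
    return max(values) - min(values) <= tolerance


def autocrop(image, tolerance=0):
    """Return (top, bottom, left, right) of the kept area; bottom and right are exclusive."""
    top, bottom = 0, len(image)
    left, right = 0, len(image[0])

    def row(r):
        return image[r][left:right]

    def col(c):
        return [image[r][c] for r in range(top, bottom)]

    changed = True
    while changed:                       # step 3: repeat while something was cut
        changed = False
        # steps 1 and 2: test each boundary line and cut it if it is homogeneous
        if bottom - top > 1 and is_uniform(row(top), tolerance):
            top += 1
            changed = True
        if bottom - top > 1 and is_uniform(row(bottom - 1), tolerance):
            bottom -= 1
            changed = True
        if right - left > 1 and is_uniform(col(left), tolerance):
            left += 1
            changed = True
        if right - left > 1 and is_uniform(col(right - 1), tolerance):
            right -= 1
            changed = True
    return top, bottom, left, right


if __name__ == "__main__":
    scan = [
        [0, 0, 0, 0, 0],
        [0, 0, 0, 0, 0],
        [0, 5, 9, 0, 0],
        [0, 7, 3, 0, 0],
        [0, 0, 0, 0, 0],
    ]
    print(autocrop(scan))
```

Only the boundary lines are ever inspected, so a homogeneous band inside the picture, or a textured border, is handled purely by local statistics and not by the scene contents.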
U.S. Pat. No. 5,978,519 considers a method of cropping an image based on differences in intensity levels. A typical image includes areas of homogeneous intensity and color and areas where intensity and color vary considerably. For example, a portrait usually contains a sharp brightness transition from the main object to the background. In this method, the image is reduced in size and divided into non-overlapping blocks. The mean and the variance of intensity are calculated for each block. A threshold is selected based on the distribution of the variance over the blocks, and all blocks whose variance exceeds the threshold are marked as areas of interest. The areas of interest are then cropped to a bounding rectangle.
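The block-based selection can be sketched in a few lines. The sketch makes several assumptions that are not taken from the cited patent: the preliminary size reduction is skipped, the threshold is simply the mean of the block variances, and the names `block_variance` and `crop_by_variance` are hypothetical.

```python
def block_variance(image, r0, c0, size):
    """Variance of intensity inside the size x size block with top-left corner (r0, c0)."""
    vals = [image[r][c] for r in range(r0, r0 + size) for c in range(c0, c0 + size)]
    mean = sum(vals) / len(vals)
    return sum((v - mean) ** 2 for v in vals) / len(vals)


def crop_by_variance(image, block=2):
    """Return (top, bottom, left, right) bounding all high-variance blocks."""
    rows = len(image) // block
    cols = len(image[0]) // block
    var = [[block_variance(image, i * block, j * block, block) for j in range(cols)]
           for i in range(rows)]
    flat = [v for line in var for v in line]
    threshold = sum(flat) / len(flat)     # assumed rule: mean of the block variances
    marked = [(i, j) for i in range(rows) for j in range(cols) if var[i][j] > threshold]
    if not marked:                        # uniform image: keep everything
        return 0, len(image), 0, len(image[0])
    top = min(i for i, _ in marked) * block
    bottom = (max(i for i, _ in marked) + 1) * block
    left = min(j for _, j in marked) * block
    right = (max(j for _, j in marked) + 1) * block
    return top, bottom, left, right


if __name__ == "__main__":
    img = [[0] * 6 for _ in range(6)]
    img[2][3] = img[3][2] = 10            # detail only in the central block
    print(crop_by_variance(img))
```

If the background is itself textured, many background blocks exceed the threshold and the bounding rectangle grows to nearly the whole image, which illustrates the limitation on non-uniform backgrounds.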
It should be noted that the above method is effective only when the initial image contains both an area where the intensity level varies considerably and an area with a constant intensity level. The efficiency of the method is expected to be comparable to that of the program XV. The method differs from the program in that the program analyzes the uniformity of the image line by line, whereas the method analyzes the image block by block. However, both perform poorly on images with a non-uniform background.
The intellectual cropping function of the package “Microsoft Digital Image Suite 2006” can detect a face in a portrait or family photo. The program offers several cropping variants, and the user then selects the required ratio of sides from a list of standard print formats. In addition, the user can set the size of the image in pixels.
On the whole, almost all existing cropping methods are developed for certain types of images, including photos of people against a rather simple background, museum photos in which the selected object of shooting is in the center of the image against a homogeneous background, and images of modeled scenes with several main subjects of various colors and shapes. Some of these methods are not intended to initially process a certain image, and the efficiency of other methods, developed using general principles, has been shown only on simple images.
U.S. Pat. No. 6,282,317 describes a method of detecting a main object in an image. This method includes receiving a digital image, extracting areas whose form and size correspond to objects presented in the image, grouping the areas into larger areas corresponding to physically connected objects, extracting at least one structural feature and at least one semantic feature for each area, and estimating, for each selected area, the probability that the area corresponds to the main object.
U.S. Pat. No. 6,654,506 describes a method of framing a digital image, which includes: inputting a belief map of the image, in which the value at each point describes the importance of the information at the corresponding point of the image; selecting a scaling factor and a cropping window; clustering areas of the belief map to define background areas, secondary areas, and main-object areas; positioning the cropping window over the main object so as to maximize the sum of the belief values inside the window; and cropping the image along the borders of the cropping window.
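The window positioning step, which maximizes the summed importance inside the cropping window, can be sketched with an exhaustive search over a summed-area table. The clustering step and the choice of scaling factor are omitted, the search strategy is an assumption and not the procedure of the cited patent, and the name `best_window` is hypothetical.

```python
def best_window(importance, win_h, win_w):
    """Return (total, row, col) of the win_h x win_w window with the largest sum."""
    h, w = len(importance), len(importance[0])
    if win_h > h or win_w > w:
        raise ValueError("window is larger than the importance map")
    # Summed-area table with a zero border: sat[r][c] = sum of importance[:r][:c].
    sat = [[0] * (w + 1) for _ in range(h + 1)]
    for r in range(h):
        for c in range(w):
            sat[r + 1][c + 1] = (importance[r][c] + sat[r][c + 1]
                                 + sat[r + 1][c] - sat[r][c])
    best = None
    for r in range(h - win_h + 1):
        for c in range(w - win_w + 1):
            total = (sat[r + win_h][c + win_w] - sat[r][c + win_w]
                     - sat[r + win_h][c] + sat[r][c])
            if best is None or total > best[0]:
                best = (total, r, c)      # ties keep the first window found
    return best


if __name__ == "__main__":
    importance = [
        [0, 0, 0, 0],
        [0, 1, 2, 0],
        [0, 3, 4, 0],
    ]
    print(best_window(importance, 2, 2))
```

The summed-area table makes each window sum a constant-time lookup, so the search cost depends on the number of window positions and not on the window size.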
The laid-open US patent application No. 2002/0191861 describes automatic and semi-automatic framing of images and, in particular, a device and a method for capturing and framing images using an electronic camera. The electronic framing device includes image processing tools, in particular an electronic processor and firmware and/or software for image processing. The device identifies features of the composition of the image and, for each identified feature, finds a similar feature among a number of predetermined features stored in the device. Then one or several predetermined composition rules associated with the stored features are selected. The device determines one or several suitable framing borders by applying the selected composition rules.
A method of controlling fragments of a photo is described in the patent application RU 2005137049. According to this reference, fragments can be obtained by mirroring or duplicating lateral parts of an image.
The laid-open US application No. 2007/0025637 was partially published earlier in Setlur et al., “Automatic Image Retargeting,” ACM International Conference on Mobile and Ubiquitous Multimedia (MUM) 2005, v. 154, pp. 59-68. The authors describe a new approach to automatic reformatting of images (automatic image retargeting) that keeps the proportions of the important objects of the image when the image size is reduced. Given an initial image and a target format, the method performs the following actions. First, the initial image is segmented into regions based on analysis of the distribution of the color and brightness characteristics of the image. Then, an importance map of the pixels/regions of the initial image is created based on a model of human vision and on methods of detecting human faces in digital images. If the target format can contain all of the important regions, the image is framed. Otherwise, the important regions are removed from the image, and the remaining image is scaled to the target format. Then, the important regions are inserted back into the modified image according to their degree of importance and topology. This technology minimizes the loss of image detail and reduces the distortions caused by traditional approaches. It should also be noted that the method moves the important regions closer to each other while keeping their topology.
In the article by Shai Avidan and Ariel Shamir, “Seam Carving for Content-Aware Image Resizing,” ACM Transactions on Graphics, Volume 26, Number 3, SIGGRAPH 2007, an effective procedure for changing the size of an image is described that considers not only geometrical constraints but also the contents of the image. The article also introduces the concept and definition of the seam carving operator, which is used to reduce and enlarge an image taking its content into account. A seam is a connected optimal path of pixels in the image, where optimality is defined using an energy function of the image. Repeatedly removing or adding seams changes the format/size of the image.
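The search for a minimum-energy seam is commonly done by dynamic programming, which the following minimal sketch illustrates for one vertical seam in a grayscale image stored as a list of rows. The energy function used here (sum of absolute horizontal and vertical brightness differences) is an assumed simple choice, and the function names are hypothetical.

```python
def energy(image):
    """Assumed energy: |horizontal difference| + |vertical difference| at each pixel."""
    h, w = len(image), len(image[0])
    e = [[0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            dx = abs(image[r][min(c + 1, w - 1)] - image[r][max(c - 1, 0)])
            dy = abs(image[min(r + 1, h - 1)][c] - image[max(r - 1, 0)][c])
            e[r][c] = dx + dy
    return e


def find_vertical_seam(e):
    """Return one column index per row for a connected path of minimal total energy."""
    h, w = len(e), len(e[0])
    cost = [row[:] for row in e]
    for r in range(1, h):
        for c in range(w):
            lo, hi = max(c - 1, 0), min(c + 1, w - 1)
            cost[r][c] += min(cost[r - 1][lo:hi + 1])   # best of up to 3 parents
    # Backtrack from the cheapest pixel in the last row.
    c = min(range(w), key=lambda k: cost[h - 1][k])
    seam = [c]
    for r in range(h - 1, 0, -1):
        lo, hi = max(c - 1, 0), min(c + 1, w - 1)
        c = min(range(lo, hi + 1), key=lambda k: cost[r - 1][k])
        seam.append(c)
    seam.reverse()
    return seam


def remove_vertical_seam(image, seam):
    """Delete one pixel per row, reducing the image width by one."""
    return [row[:c] + row[c + 1:] for row, c in zip(image, seam)]


if __name__ == "__main__":
    print(find_vertical_seam([[1, 4, 3], [5, 2, 6], [7, 8, 1]]))
```

Calling `remove_vertical_seam(image, find_vertical_seam(energy(image)))` repeatedly narrows the image one column at a time; horizontal seams can be handled by transposing the image first.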
Recently, the number of digital photos and document images received and displayed by various devices, such as digital cameras, mobile phones, and television devices (including high-definition devices), has grown continuously. However, the ratio of the geometrical sizes (format) of a digital image does not always correspond to the ratio of the geometrical sizes of the display area. For example, a digital picture has a ratio of sides of 4:3, while the print is made on paper with a ratio of sides of 3:2. A set of methods has been proposed for matching the ratio of sides of a digital picture to that of the display area. The best-known methods are scaling, framing, and adaptive addition or duplication of parts of the image. These methods are widespread because they are rather simple to implement. However, existing approaches have two essential disadvantages: reduction of the viewing area due to framing, or the appearance of a “stretched image” or “compressed image” effect.
Algorithms based on the idea of scaling the image change the proportions of objects in accordance with the change in the ratio of sides of the image. Framing technologies lose parts of the image, and technologies that add or duplicate parts of the image introduce parts that do not exist in reality, so the resulting image looks artificial in some cases.
Known analogous technical solutions do not provide adaptive reformatting of digital images depending on their contents. However, the results of reformatting strongly depend on the content of the image, and special steps are required to prevent changes in the proportions and sizes of the most important objects in the image. The most significant objects in digital photos are images of people, and the most significant objects in document images are text inscriptions.
Besides, known analogues in the field of reformatting have essential limitations: either the proportions of objects change or the size of the resulting image is restricted.
In the known analogues describing technical solutions for changing the format of digital images, framing and duplication of image elements are used, which are based on operations of adding/removing groups of pixels. A group of pixels is a horizontal or vertical line of connected pixels. However, it is desirable that groups of pixels be removed or added mainly at the borders of the image and/or that one or more objects located near the borders of the image not be distorted.