1. Field of the Invention
The present invention relates to an image processing apparatus for use in computers for conducting image processing, word processors, portable information tools, copying machines, scanners, facsimiles or the like. More specifically, the present invention relates to an image processing apparatus enabling a user to designate the coordinates of any point on the image by a coordinate input apparatus such as a mouse, a pen or a tablet, or an image processing apparatus capable of photoelectrically converting a printed image on a piece of paper or the like with coordinates being designated in a different type of ink so as to input the image and the coordinates. In either case, the image processing apparatus is capable of cutting out an object image with an arbitrary size at an arbitrary position from the original image.
2. Description of the Related Art
When an image including an object or a person's face of interest is cut out from the original image, the image is cut with a desired size using a pair of scissors, a cutter or the like, in the case of a photograph. In the case of an electronic image obtained by a CCD camera or a scanner, however, the positions of two points are designated by a coordinate input device such as a mouse, using software for image processing (e.g., the image processing software “PhotoShop” made by Adobe Inc.), and a rectangle having a diagonal between the two points is designated as a region.
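The two-point designation described above can be sketched as follows. This is a minimal illustration rather than the behavior of any particular software: the function names `rect_from_diagonal` and `crop`, the inclusive bounds, and the list-of-rows image layout are all assumptions made for the example.

```python
def rect_from_diagonal(p1, p2):
    """Return (left, top, right, bottom) for the rectangle whose diagonal joins p1 and p2."""
    (x1, y1), (x2, y2) = p1, p2
    return min(x1, x2), min(y1, y2), max(x1, x2), max(y1, y2)


def crop(image, rect):
    """Cut the rectangle out of an image stored as a list of pixel rows (bounds inclusive)."""
    left, top, right, bottom = rect
    return [row[left:right + 1] for row in image[top:bottom + 1]]


# Hypothetical 4x5 image whose pixel values encode (row, column) as row * 10 + column.
image = [[r * 10 + c for c in range(5)] for r in range(4)]
# The two points may be given in any order, e.g. when dragging up and to the left.
region = crop(image, rect_from_diagonal((3, 2), (1, 0)))
```

Normalizing the two points with `min` and `max` makes the result independent of the direction in which the user drags the pointer.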
In order to output a part of the original image, which includes an object of interest, as an image having a particular size, a portion having the object of interest at a well-balanced position is first cut out from the original image, and thereafter, is magnified/reduced to a required size. In the case of a photograph, such magnification/reduction is conducted by, for example, a copying machine. In the case of an electronic image, magnifying/reducing the image to a desired size can be easily carried out. However, cutting out a portion having the object of interest at a well-balanced position must be conducted before such magnification/reduction.
Furthermore, in order to extract a region representing a person's face except for hair (hereinafter, this portion is referred to as a “face skin”) from the original image, a face skin region which is visually determined by an operator is painted out. In the case of an electronic image, a pixel is designated by a coordinate input device such as a mouse, and those pixels having a similar color to that of the designated pixel are combined to be extracted as one region (e.g., “PhotoShop” as mentioned above). There is also a method as follows: the color distribution of a face skin is analyzed in advance to set a probability density function. Then, the probability density of the input pixels is obtained using values such as RGB (red, green, blue) values and HSV (hue, color saturation, brightness) values as arguments, thereby designating those pixels having a probability equal to or higher than a prescribed value as a face-skin region (R. Funayama, N. Yokoya, H. Iwasa and H. Takemura, “Facial Component Extraction by Cooperative Active Nets with Global Constraints”, Proceedings of 13th International Conference on Pattern Recognition, Vol. 2, pp. 300-305, 1996).
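The probability-density approach can be illustrated with a small sketch. It is not the model of the cited reference: the choice of normalized (r, g) chromaticity as the argument, the independent Gaussian form, the parameter values in `MODEL`, and the threshold are all assumptions chosen only to show the thresholding step.

```python
import math


def gaussian_density(value, mean, std):
    """One-dimensional normal probability density."""
    z = (value - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def skin_density(rgb, model):
    """Density of a pixel under an assumed skin model on normalized (r, g) chromaticity."""
    red, green, blue = rgb
    total = red + green + blue
    if total == 0:
        return 0.0  # pure black carries no chromaticity information
    r, g = red / total, green / total
    return (gaussian_density(r, model["r_mean"], model["r_std"])
            * gaussian_density(g, model["g_mean"], model["g_std"]))


def skin_mask(image, model, threshold):
    """Mark pixels whose density is equal to or higher than the prescribed value."""
    return [[skin_density(px, model) >= threshold for px in row] for row in image]


# Illustrative parameters only; a real model would be fitted to sample skin pixels.
MODEL = {"r_mean": 0.45, "r_std": 0.05, "g_mean": 0.31, "g_std": 0.03}
image = [[(200, 140, 110), (30, 60, 200)],
         [(0, 0, 0), (190, 135, 105)]]
mask = skin_mask(image, MODEL, threshold=10.0)
```

Because the model is fixed in advance, pixels whose color is shifted by lighting fall outside the threshold, which is the weakness of a pre-set density function.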
Conventionally, in the case where a rectangle including a face-skin region in the image is determined, the rectangle is commonly determined visually by an operator.
Moreover, the central axis of a person's face has been commonly detected based on the visual determination of an operator.
Another method for detecting the central axis of the face is as follows: a skin-color portion of the face is extracted as a region, and the region is projected to obtain a histogram. Then, the right and left ends of the face are determined from the histogram, whereby the line passing through the center thereof is determined as the central axis of the face (Japanese Laid-Open Publication No. 7-181012).
Furthermore, respective vertical positions of the nose, the eyes and the mouth on the face have been commonly detected based on the visual determination of an operator.
Another method is to match an image template of the nose with an input image (R. Brunelli and T. Poggio, “Face Recognition: Features versus Templates”, IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 15, No. 10, pp. 1042-1052, 1993). In this article, a method for detecting the vertical positions by projecting a gray-level image or an edge image to obtain a histogram, and examining peaks and valleys of the histogram, has also been proposed.
Moreover, the width of the face has been commonly detected based on the visual determination of an operator.
Another method is as follows: a skin-color portion of the face is extracted as a region, and the region is projected to obtain a histogram. Then, the right and left ends of the face are determined from the histogram, whereby the distance between the ends is obtained as the width of the face (Japanese Laid-Open Publication No. 7-181012).
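The projection method described for the central axis and for the face width can be sketched in a few lines. The sketch assumes a binary skin mask stored as a list of rows and a hypothetical `min_count` parameter for deciding which columns belong to the face; neither detail is taken from the cited publication.

```python
def column_projection(mask):
    """Count the skin pixels in each column of a binary mask (list of rows)."""
    return [sum(column) for column in zip(*mask)]


def face_extent(mask, min_count=1):
    """Return (left, right, width, center_x) from the column projection, or None if empty."""
    histogram = column_projection(mask)
    columns = [x for x, count in enumerate(histogram) if count >= min_count]
    if not columns:
        return None
    left, right = columns[0], columns[-1]
    return left, right, right - left + 1, (left + right) / 2.0


mask = [
    [0, 0, 1, 1, 1, 0, 0],
    [0, 1, 1, 1, 1, 1, 0],
    [0, 1, 1, 1, 1, 1, 0],
    [0, 0, 1, 1, 1, 0, 0],
]
extent = face_extent(mask)
```

The axis is simply the midpoint of the two ends, so it coincides with the true axis of the face only when the skin region is symmetric about that axis.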
As described above, in order to output a part of the original image, which includes a person's face of interest, as an image having a particular size, a portion having the face at a well-balanced position is first cut out from the original image, and thereafter, is magnified/reduced to a required size. In the case of a photograph, such magnification/reduction is conducted by, for example, a copying machine. In the case of an electronic image, magnifying/reducing the image to a desired size can be carried out easily. However, cutting out a portion having the object of interest at a well-balanced position must be conducted before such magnification/reduction.
In the case of an electronic image, it is also possible for a user to adjust, in advance, the size of the face of the original image to an appropriate size, move a frame on the screen according to the visual determination of the user so that the face is located in the center, and output the image located within the frame. An apparatus achieving such an operation has been proposed in Japanese Laid-Open Publication No. 64-82854.
In order to achieve improved visual recognition of a person's face on a photograph or an image, the amount of exposure light for printing is adjusted in the case of a photograph. For an electronic image, there is software capable of conducting adjustment of contrast, tonality and brightness, edge sharpening, blurring processing and the like (e.g., “PhotoShop” as mentioned above).
When an image including an object or a person's face of interest is cut out from the original image, the image is cut with a desired size using a pair of scissors, a cutter or the like, in the case of a photograph. However, using a pair of scissors, a cutter or the like to cut an image is actually time-consuming. Moreover, cutting a portion including the object or the face of interest at a well-balanced position requires much skill. When software for processing an electronic image obtained by a CCD camera or converted by a scanner is utilized (e.g., “PhotoShop” as mentioned above), the positions of two points are usually designated by a coordinate input device such as a mouse, and a rectangle having a diagonal between the two points is designated as a region. In this case as well, cutting out a portion including an object or a face of interest at a well-balanced position requires much skill. Furthermore, in the case where an object or a face of interest is originally located at the edge of the screen, and a portion including the object or the face at a well-balanced position in the center is to be cut out from the image, it is necessary to first cut out the portion from the original image, and thereafter, move the position of the object or the face to the center of the resultant image.
As described above, in order to output a part of the original image, which includes an object of interest, as an image having a particular size, a portion having the object of interest at a well-balanced position is first cut out from the original image, and thereafter, is magnified/reduced to a required size. In the case of a photograph, such magnification/reduction is conducted by, for example, a copying machine. However, the image is not always cut to the same size. Therefore, in order to obtain an image with a desired size, a troublesome operation of calculating the magnification/reduction ratio is required. In the case of an electronic image, magnifying/reducing the image to a desired size is easy. However, cutting out a portion having the object of interest at a well-balanced position must be conducted before such magnification/reduction. In short, at least two operations are required to output an image having a particular size.
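The ratio calculation mentioned above is simple arithmetic, but it has to be repeated for every cut of a different size. A minimal sketch follows, assuming the aspect ratio is preserved and the cut region must fit entirely inside the target size; the function names and the sample dimensions are hypothetical.

```python
def scale_ratio(cut_size, target_size):
    """Uniform ratio that fits a cut region of cut_size inside target_size, aspect preserved."""
    cut_w, cut_h = cut_size
    target_w, target_h = target_size
    return min(target_w / cut_w, target_h / cut_h)


def scaled_size(cut_size, target_size):
    """Pixel dimensions of the cut region after applying the ratio."""
    ratio = scale_ratio(cut_size, target_size)
    return round(cut_size[0] * ratio), round(cut_size[1] * ratio)


# A 320x400 cut reduced to fit a 240x320 output.
ratio = scale_ratio((320, 400), (240, 320))
size = scaled_size((320, 400), (240, 320))
```

Here the width is the limiting dimension, so the ratio is 0.75 and the scaled cut is 240x300, leaving unused height in the 240x320 target. Avoiding that mismatch requires choosing the cut with the target aspect ratio in mind, which is why cutting and scaling are two separate operations in the conventional workflow.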
Furthermore, the above-mentioned method of painting out a visually determined face-skin region is troublesome regardless of whether an image to be processed is a photograph or an electronic image. Moreover, painting a portion at the boundary between the face skin region and the other regions must be conducted extremely carefully. In the case of an electronic image, the above-mentioned method of combining those pixels having similar color to that of the designated pixel to extract them as one region (e.g., “PhotoShop” as mentioned above) has been used. In this method, however, since the colors of the skin, the lip and the eyes are different, it is necessary to combine the results of several operations in order to extract the whole face-skin. Moreover, the skin color may be significantly uneven even in the same person due to, for example, different skin shades or shadows. In this case as well, the results of several operations must be combined. Also described above is the method of designating those pixels having a probability equal to or higher than a prescribed value as a face-skin region (the above-cited reference by R. Funayama, N. Yokoya, H. Iwasa and H. Takemura). According to this method, however, a face-skin region might not be successfully extracted in the case where the image's brightness is extremely uneven due to the photographing conditions or the conditions at the time of obtaining the image, or in the case where the color of the skin is different due to a racial difference.
As described above, when a rectangle including a face-skin region is to be obtained, the rectangle has been commonly determined visually by an operator. However, such a method is troublesome regardless of whether an image to be processed is a photograph or an electronic image.
Moreover, in the above-mentioned method of detecting the central axis of a person's face from a histogram (Japanese Laid-Open Publication No. 7-181012), the correct central axis can only be detected in the case where the face is completely directed to the front, while the correct central axis cannot be obtained in the case where the face is turned even slightly to either side.
Furthermore, according to the above-mentioned method of matching an image template of the nose with an input image (the above-cited reference by R. Brunelli and T. Poggio), it is desirable that the size of the nose to be extracted is known. In the case where the size of the nose is not known, templates of various sizes must be matched with the input image, requiring substantial time for calculation. Moreover, according to the above-mentioned method of detecting the vertical positions by examining peaks and valleys of the histogram (the above-cited reference by R. Brunelli and T. Poggio), the vertical positions might not be correctly extracted, for example, in the case where the face skin region or the background is not known. In short, erroneous extraction could occur unless such preconditions are satisfied.
Moreover, according to the above-mentioned method to detect a width of the face (Japanese Laid-Open Publication No. 7-181012), a face skin region should be correctly extracted based on the color information. However, in the case where a background region includes a color similar to that of the face skin, a region other than the face skin region might be determined as a face skin, or a shaded portion in the face skin region might not be determined as face skin. The detected width of the face might be different depending upon whether or not the ears can be seen on the image. Moreover, the detected width could be larger than the correct width in the case where the face is turned toward either side.
As described above, in order to output a part of the original image, which includes an object of interest, as an image having a particular size, a portion having the object of interest at a well-balanced position is first cut out from the image, and thereafter, is magnified/reduced to a required size. In the case of a photograph, such magnification/reduction is conducted by, for example, a copying machine. However, the image is not always cut to the same size. Therefore, in order to obtain an image with a desired size, a troublesome operation of calculating the magnification/reduction ratio is required. In the case of an electronic image, magnifying/reducing the image to a desired size can be easily carried out. However, cutting out a portion having the object of interest at a well-balanced position must be conducted before such magnification/reduction. In short, at least two operations are required to output an image having a particular size. According to a somewhat automated method as described in Japanese Laid-Open Publication No. 64-82854, the user adjusts, in advance, the size of the face of the original image to an appropriate size, moves a frame on the screen according to the visual determination of the user so that the face is located in the center, and outputs the image located within the frame. Alternatively, the user adjusts, in advance, the size of the face of the original image to an appropriate size, moves a T-shaped indicator on the screen according to the visual determination of the user so that the ends of the horizontal line of the T-shaped indicator overlap the eyes, respectively, and then outputs an image within a rectangle defined with an appropriate margin from the T-shaped indicator.
The above-described operation of adjusting the amount of exposure light for printing in order to achieve improved visual recognition of a person's face on a photograph or an image requires much skill. For an electronic image, there is software capable of conducting adjustment of contrast, tonality and brightness, edge sharpening, blurring processing and the like (e.g., “PhotoShop” as mentioned above), as described above. In this case as well, using such software requires much skill, and usually, various operations must be tried until a desired image is obtained.
According to one aspect of the present invention, an image processing apparatus includes a designating section for designating an arbitrary region or an arbitrary position of an image; a specifying section for specifying an object region which is present in the designated region or position, and which can additionally be in a vicinity of the designated region or position, from pixel information in the designated region or position; a determining section for determining an image region to be cut out from the image, based on the specified object region; and a cutting section for cutting out the determined image region from the image.
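The four sections of this aspect can be pictured as a short pipeline. The sketch below is one possible reading, not the claimed implementation: it assumes a grayscale image, uses 4-connected region growing from the designated position as the way of specifying the object region, and takes the bounding box plus a fixed margin as the image region to be cut out. The names `grow_region`, `determine_cut` and `cut_out`, as well as the tolerance and margin values, are hypothetical.

```python
from collections import deque


def grow_region(image, seed, tolerance):
    """Specify the object region: 4-connected pixels whose value is close to the seed's."""
    height, width = len(image), len(image[0])
    seed_x, seed_y = seed
    reference = image[seed_y][seed_x]
    region, queue = {(seed_x, seed_y)}, deque([(seed_x, seed_y)])
    while queue:
        x, y = queue.popleft()
        for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if (0 <= nx < width and 0 <= ny < height and (nx, ny) not in region
                    and abs(image[ny][nx] - reference) <= tolerance):
                region.add((nx, ny))
                queue.append((nx, ny))
    return region


def determine_cut(region, margin, width, height):
    """Determine the image region: bounding box of the object plus a margin, clipped."""
    xs = [x for x, _ in region]
    ys = [y for _, y in region]
    return (max(min(xs) - margin, 0), max(min(ys) - margin, 0),
            min(max(xs) + margin, width - 1), min(max(ys) + margin, height - 1))


def cut_out(image, rect):
    """Cut the determined region (inclusive bounds) out of the image."""
    left, top, right, bottom = rect
    return [row[left:right + 1] for row in image[top:bottom + 1]]


# Hypothetical grayscale image: a bright 2x2 object on a dark background.
image = [
    [10, 10, 10, 10, 10, 10],
    [10, 10, 10, 10, 10, 10],
    [10, 10, 200, 205, 10, 10],
    [10, 10, 198, 202, 10, 10],
    [10, 10, 10, 10, 10, 10],
]
designated = (2, 2)                                          # designating section
region = grow_region(image, designated, tolerance=10)        # specifying section
rect = determine_cut(region, 1, len(image[0]), len(image))   # determining section
result = cut_out(image, rect)                                # cutting section
```

A single designated position is enough here because the specified region grows outward into the vicinity of that position, so the user does not have to outline the object.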
In one example, the determining section includes a section for adjusting a size of the image region to a prescribed size.
In one example, the determining section includes a correcting section for entirely correcting the designated image region or correcting only a part of the designated image region.
According to another aspect of the present invention, an image processing apparatus includes a designating section for designating an arbitrary region or an arbitrary position of an image; an analyzing section for analyzing a color distribution in the designated region or position and in a vicinity of the designated region or position; an adjusting section for adjusting a condition for specifying a face image which is present in the image, according to a result of the analysis; a specifying section for specifying a face image region which is present in the designated region or position, and which can additionally be in the vicinity of the designated region or position, based on the adjusted condition; a determining section for determining an image region to be cut out from the image, based on the specified face image region; and a cutting section for cutting out the determined image region from the image.
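The analyze-adjust-specify sequence of this aspect can be sketched as follows. The sketch assumes a single color channel (for instance hue) per pixel and an acceptance range of the mean plus or minus `k` standard deviations with a minimum half-width `floor`; these statistics, both parameters and all function names are illustrative assumptions, since the condition actually used is not stated at this point in the text.

```python
from statistics import mean, pstdev


def analyze(image, rect):
    """Mean and standard deviation of pixel values inside rect (inclusive bounds)."""
    left, top, right, bottom = rect
    samples = [image[y][x] for y in range(top, bottom + 1) for x in range(left, right + 1)]
    return mean(samples), pstdev(samples)


def adjust_condition(stats, k=3.0, floor=5.0):
    """Turn the analysis into an accepted value range; floor avoids a zero-width range."""
    center, spread = stats
    half_width = max(k * spread, floor)
    return center - half_width, center + half_width


def specify(image, condition):
    """Mark every pixel of the image that satisfies the adjusted condition."""
    low, high = condition
    return [[low <= value <= high for value in row] for row in image]


# Hypothetical single-channel image: a face (values near 20) on a background (90).
image = [
    [90, 90, 90, 90, 90],
    [90, 20, 22, 24, 90],
    [90, 21, 20, 23, 90],
    [90, 90, 90, 90, 90],
]
designated = (1, 1, 2, 2)  # the user designates only part of the face
condition = adjust_condition(analyze(image, designated))
face_mask = specify(image, condition)
```

Because the condition is derived from the image itself rather than fixed in advance, the pixels in column 3 are accepted even though they lie outside the designated rectangle.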
In one example, the determining section includes a section for adjusting a size of the image region, using the region or the position designated by the designating section as a reference.
In one example, the specifying section includes a section for applying noise elimination or labelling to the specified face image region to produce a face mask; a section for vertically scanning the produced face mask to obtain a sum of vertical differential luminance values of pixels in the image corresponding to the face mask to produce a histogram; and a section for detecting a central axis of a face from a profile of the produced histogram.
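The histogram of this example can be sketched as below. The per-column sum of absolute vertical luminance differences follows the wording above; the rule for reading the axis off the profile is an assumption of this sketch (the column with the largest sum), as are the function names and the sample data.

```python
def vertical_difference_histogram(luminance, mask):
    """For each column, sum |L[y+1][x] - L[y][x]| over pixel pairs that lie inside the mask."""
    height, width = len(luminance), len(luminance[0])
    histogram = [0] * width
    for x in range(width):
        for y in range(height - 1):
            if mask[y][x] and mask[y + 1][x]:
                histogram[x] += abs(luminance[y + 1][x] - luminance[y][x])
    return histogram


def central_axis(histogram):
    """Assumed rule: the axis is the column with the largest sum (first one on ties)."""
    return max(range(len(histogram)), key=histogram.__getitem__)


# Hypothetical luminance image with dark features concentrated around column 2.
luminance = [
    [100, 100, 100, 100, 100],
    [100, 100, 60, 100, 100],
    [100, 100, 100, 100, 100],
    [100, 80, 40, 80, 100],
    [100, 100, 100, 100, 100],
]
mask = [[0, 1, 1, 1, 0] for _ in range(5)]  # face mask covering columns 1 to 3
histogram = vertical_difference_histogram(luminance, mask)
axis_x = central_axis(histogram)
```

Unlike a projection of the skin region alone, this profile depends on luminance changes inside the mask rather than on the outline of the mask.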
In one example, the specifying section includes a section for applying noise elimination or labelling to the specified face image region to produce a face mask; a section for vertically scanning the produced face mask to obtain a mean luminance value of pixels in the image corresponding to the face mask to produce a histogram; and a section for detecting a vertical nose position from a profile of the produced histogram.
In one example, the specifying section includes a section for applying noise elimination or labelling to the specified face image region to produce a face mask; a section for horizontally scanning the produced face mask to obtain a mean luminance value of pixels in the image corresponding to the face mask to produce a histogram; and a section for detecting a vertical eye position from a profile of the produced histogram.
In one example, the specifying section includes a section for applying noise elimination or labelling to the specified face image region to produce a face mask; a section for horizontally scanning the produced face mask to obtain a mean luminance value of pixels in the image corresponding to the face mask to produce a histogram; and a section for detecting a vertical mouth position from a profile of the produced histogram.
In one example, the specifying section further includes a section for detecting a vertical eye position from the profile of the produced histogram; and a section for obtaining a middle position of a region between the detected vertical eye position and the detected vertical mouth position to detect a width of the face mask at the middle position.
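The eye, mouth and width examples above share one row-wise histogram, so they can be sketched together. The mean luminance per row and the width measured at the middle row follow the text; the detection rule is an assumption of this sketch (eyes as the darkest row in the upper half of the mask, mouth as the darkest row in the lower half), and all names and sample values are hypothetical.

```python
def row_mean_luminance(luminance, mask):
    """Mean luminance of masked pixels per row; None where the row has no masked pixel."""
    histogram = []
    for lum_row, mask_row in zip(luminance, mask):
        values = [value for value, inside in zip(lum_row, mask_row) if inside]
        histogram.append(sum(values) / len(values) if values else None)
    return histogram


def darkest_row(histogram, start, stop):
    """Assumed rule: pick the row with the lowest mean luminance in [start, stop)."""
    rows = [y for y in range(start, stop) if histogram[y] is not None]
    return min(rows, key=lambda y: histogram[y]) if rows else None


def mask_width_at(mask, y):
    """Width of the face mask on row y, from leftmost to rightmost masked pixel."""
    columns = [x for x, inside in enumerate(mask[y]) if inside]
    return columns[-1] - columns[0] + 1 if columns else 0


mask = [
    [0, 1, 1, 1, 0],
    [1, 1, 1, 1, 1],
    [1, 1, 1, 1, 1],
    [1, 1, 1, 1, 1],
    [0, 1, 1, 1, 0],
    [0, 0, 1, 0, 0],
]
luminance = [
    [200, 200, 200, 200, 200],
    [200, 120, 200, 120, 200],  # eyes
    [200, 200, 200, 200, 200],
    [200, 200, 190, 200, 200],  # nose
    [200, 150, 100, 150, 200],  # mouth
    [200, 200, 200, 200, 200],
]
histogram = row_mean_luminance(luminance, mask)
half = len(mask) // 2
eye_y = darkest_row(histogram, 0, half)
mouth_y = darkest_row(histogram, half, len(mask))
middle_y = (eye_y + mouth_y) // 2
face_width = mask_width_at(mask, middle_y)
```

Measuring the width at the row midway between the eyes and the mouth samples the mask around cheek level, away from the hairline and the chin.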
In one example, the determining section includes a section for adjusting a position of the image region, based on the face image region, a central axis of a face in the face image, a vertical nose position of the face in the face image, a vertical eye position of the face in the face image, a vertical mouth position of the face in the face image, and a width of a face mask of the face image.
In one example, the determining section includes a section for adjusting a size of the image region, based on the face image region, a central axis of a face in the face image, a vertical nose position of the face in the face image, a vertical eye position of the face in the face image, a vertical mouth position of the face in the face image, and a width of a face mask of the face image.
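How the detected face features might drive the position and size of the cut region can be sketched as follows. The layout rule is entirely an assumption made for illustration: the cut is `width_factor` times the face width, has a fixed aspect ratio, is centered horizontally on the central axis, and places the midpoint between the eyes and the mouth at `center_fraction` of its height. None of these parameters or names comes from the text above.

```python
def portrait_rect(axis_x, eye_y, mouth_y, face_width,
                  aspect=0.75, width_factor=2.0, center_fraction=0.5):
    """Return (left, top, width, height) of a cut region placed around a face.

    aspect is width / height of the cut; all keyword values are illustrative.
    """
    width = face_width * width_factor
    height = width / aspect
    left = axis_x - width / 2.0
    face_center_y = (eye_y + mouth_y) / 2.0
    top = face_center_y - center_fraction * height
    return left, top, width, height


def output_scale(rect, output_width):
    """Ratio that brings the cut region to the prescribed output width."""
    return output_width / rect[2]


# Hypothetical measurements in pixels of the original image.
rect = portrait_rect(axis_x=100, eye_y=80, mouth_y=120, face_width=60)
scale = output_scale(rect, output_width=240)
```

Because both the rectangle and the scale follow from the measured features, the cut and the magnification/reduction can be decided in one step instead of two manual operations.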
In one example, the determining section includes a correcting section for entirely correcting the designated image region or correcting only a part of the designated image region.
Thus, the invention described herein makes possible the advantage of providing an image processing apparatus capable of photoelectrically converting a printed image on a piece of paper or the like with coordinates being designated in a different type of ink so as to input the image and the coordinates, the image processing apparatus being capable of cutting out an object image with an arbitrary size at an arbitrary position from the original image.
This and other advantages of the present invention will become apparent to those skilled in the art upon reading and understanding the following detailed description with reference to the accompanying figures.