1. Field of the Invention
The present invention relates to image processing methods, image processing devices, image processing programs, and integrated circuits including the stated image processing devices for enhancing the feeling of depth and the three-dimensional effect of a two-dimensional image based on the foreground and background regions of the image.
2. Description of the Related Art
There has been a strong call by users for technology that increases the “feeling of depth” and the “three-dimensional effect” of displayed images in order to display more natural images on the screen of, for example, a large-screen FPD (flat panel display) device. Three-dimensional televisions and the like that utilize the binocular parallax of humans have been proposed in response to this demand, but it has been pointed out that special dedicated glasses are often required, that there is a large degree of dependence on the image, and that the special devices that are required increase costs. At the present time, one of the selling points of large screen display devices is their technology that achieves a three-dimensional effect in the displayed image (video) by smoothing the gradation characteristics or the color characteristics in the displayed image.
It is clear that humans utilize not only binocular parallax but also monocular information, such as color information, saturation (color saturation), brightness, contrast (color information contrast and brightness information contrast), shadows (gradations), gradient of texture, and relative size, in order to perceive depth and three dimensions in two-dimensional images. How to estimate depth information for a two-dimensional image is a key point in improving the “feeling of depth” and “three-dimensional effect” in a displayed image.
Patent Document 1 (JP H10-126708A) can be given as an example of a conventional technique that extracts and processes depth information in order to use such monocular information to increase the sense of depth and distance. In Patent Document 1, a region in which the level of a first order differential signal or a second order differential signal in an image signal is high is taken as the foreground region of the image, whereas a region in which the level of the stated signal is low is taken as the background region of the image, and a sense of distance is imparted to the image by altering the degree to which borders are enhanced.
FIG. 47 is a block diagram that shows the configuration of this conventional image processing device (three-dimensional device) 9100.
The image processing device 9100 enhances borders in an image formed by an inputted image signal by adding edges in the image, where the inputted image signal is a Y luminance signal. The image processing device 9100 includes the following: a differential circuit 9101 that performs a differential process on the inputted image signal (input image signal); a distance detection portion 9102 that detects near and far regions of the image based on a first order differential value and a second order differential value of the input image signal; a coefficient weighting portion 9103 that multiplies the second order differential signal by a coefficient value based on the results of the detection performed by the distance detection portion 9102; a delay portion 9104 for adjusting the timing at which the input image signal is processed; and an adding portion 9105 that adds the input image signal Y′ that has been delayed by the delay portion 9104 to the differential signal EG that has undergone the coefficient weighting performed by the coefficient weighting portion 9103.
The distance detection portion 9102 determines whether a target pixel (that is, a pixel corresponding to the input image signal) belongs to the foreground region or the background region based on the signal levels of a signal S3, in which a first order differential signal DY1 has been quantized, and a signal S5, in which a second order differential signal DY2 has been quantized. The distance detection portion 9102 compares the signal level of the signal S3 with a setting value that determines whether or not the pixel belongs to a border portion region, sets a signal S4 to “1” if the signal level of the signal S3 is greater than or equal to the setting value, and sets the signal S4 to “0” if the signal level of the signal S3 is less than the setting value. In regions in which the signal S4 is “1”, the distance detection portion 9102 determines whether pixels in border portions are in the foreground or in the background based on whether or not the signal level of the signal S5 (the signal obtained by quantizing the absolute value of the second order differential signal DY2) is greater than a threshold TH. The distance detection portion 9102 determines that the target pixel belongs to the foreground region in the case where the signal level of the signal S5 is greater than TH. The distance detection portion 9102 then specifies a value K1, which is a value greater than a default value KS, as a coefficient KM to be multiplied by the second order differential signal DY2, and outputs the specified value to the coefficient weighting portion 9103. Meanwhile, if the signal level of the signal S5 is not greater than TH, the distance detection portion 9102 determines that the target pixel belongs to the background region. The distance detection portion 9102 then specifies a value K2, which is a value lower than the default value KS, as the coefficient KM to be multiplied by the second order differential signal DY2, and outputs the specified value to the coefficient weighting portion 9103.
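The coefficient selection performed by the distance detection portion 9102 can be sketched as follows. This is only a minimal illustration under stated assumptions, not the circuit of Patent Document 1: the function name and parameter names are hypothetical, the foreground branch is taken when S5 exceeds TH and the background branch otherwise, and the use of the default value KS for pixels outside border portions (S4 = 0) is an assumption, since that case is not described above.

```python
def select_coefficient(s3, s5, border_setting, th, ks, k1, k2):
    """Return the coefficient KM for one pixel (hypothetical helper).

    s3: quantized level of the first order differential signal DY1
    s5: quantized level of |DY2| (second order differential signal)
    border_setting: setting value used to flag border portions
    th: threshold TH separating foreground from background
    ks, k1, k2: default, foreground and background coefficients (k2 < ks < k1)
    """
    # S4 is 1 when the pixel is judged to lie in a border portion.
    s4 = 1 if s3 >= border_setting else 0
    if s4 == 0:
        # Assumption: pixels outside border portions keep the default value KS.
        return ks
    # Border pixel: a strong second order response is treated as foreground.
    return k1 if s5 > th else k2


# Example values chosen only for illustration.
km_foreground = select_coefficient(s3=5, s5=9, border_setting=4, th=6, ks=1.0, k1=1.5, k2=0.5)
km_background = select_coefficient(s3=5, s5=3, border_setting=4, th=6, ks=1.0, k1=1.5, k2=0.5)
```

Note that a pixel whose S5 level lies just on either side of TH receives a very different coefficient, which is the threshold sensitivity discussed later in this section.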
In this manner, the image processing device 9100 determines whether a pixel thought to be located on a border portion is present in the foreground or the background regions by determining the level of the second order differential signal DY2 of the pixel relative to a threshold. The image processing device 9100 increases the sense of distance in the vicinity of border portions by increasing or decreasing the coefficient for weighting the second order differential signal DY2, increasing the coefficient when the determination results indicate that the pixel is in the foreground and decreasing the coefficient when the determination results indicate that the pixel is in the background, thereby carrying out a border portion enhancement process.
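The border portion enhancement itself, in which the weighted second order differential signal is added to the delayed input signal, might look like the following for a single row of luminance values. This is a sketch under assumptions: the function name is hypothetical, the delay portion is not modeled because it only aligns timing, and the sign of the second difference is chosen here so that adding it sharpens the edge, which the description above does not specify.

```python
def enhance_row(y, km):
    """Add a weighted second order differential signal to one luminance row.

    y:  list of luminance values (the signal Y')
    km: list of per-pixel coefficients KM, same length as y
    """
    out = list(y)  # the two end pixels are left unchanged
    for i in range(1, len(y) - 1):
        # Assumed sign convention: negated discrete second difference,
        # so that adding it steepens the transition at an edge.
        dy2 = 2 * y[i] - y[i - 1] - y[i + 1]
        eg = km[i] * dy2          # weighted differential signal EG
        out[i] = y[i] + eg        # adding portion: Y' + EG
    return out


row = [10, 10, 10, 20, 20, 20]
strong = enhance_row(row, [1.0] * len(row))   # larger coefficient (foreground-like)
weak = enhance_row(row, [0.5] * len(row))     # smaller coefficient (background-like)
```

With the larger coefficient the undershoot and overshoot around the step are twice as large as with the smaller one, which is how the difference in coefficient translates into a difference in perceived border strength.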
Various approaches are also being considered for implementing image synthesis and virtual viewpoint movement processes by estimating three-dimensional constructions in two-dimensional images and utilizing the results of the estimation. Among these, there are techniques that calculate a vanishing point in a perspective and implement image synthesis and virtual viewpoint movement processes based on the calculated vanishing point (for example, see Non-Patent Document 1 (Y. Horry, K. Anjyo, and K. Arai: “Tour Into the Picture: Using a Spidery Mesh Interface to Make Animation from a Single Image”, SIGGRAPH '97 Proceedings, pp. 225-232 (1997)) and Patent Document 2 (JP H9-185712A)).
“Tour Into the Picture”, which is discussed in Non-Patent Document 1, makes it possible to remove foreground objects from a photographed image, estimate a vanishing point within the perspective, and, based on the estimated vanishing point, generate a rough construction of a scene for carrying out viewpoint movement. In contrast to “Tour Into the Picture”, in which the depth structure has a tube-like shape whose cross section is a rectangle, the method of Patent Document 2 uses a perspective-based approach, in which the depth structure is a tube whose cross section is a border line according to the depth. The method described in Patent Document 2 creates a pseudo three-dimensional image by treating the camera as the center point (that is, the vanishing point). To be more specific, border line distance information is added to mesh image data to produce three-dimensional polygon object data. Color image data obtained from a photographic image is applied to the three-dimensional polygon object data, and the three-dimensional polygon object constructed by the three-dimensional polygon object data is rendered so that the color image data is pasted on the inside of the three-dimensional polygon object, through which three-dimensional image data is obtained.
With the stated conventional image processing methods, the foreground/background determination is carried out only on border portions that have relatively large differential values, and thus the foreground/background determination is not carried out on, for example, the weak border portions found in texture patterns, border portions that cannot be appropriately extracted due to photographic conditions such as ambient light, and so on. In other words, there is a high probability that border portion extraction based on the first order differential signal will be affected by how accurately the threshold is determined. Furthermore, the foreground/background determination for a pixel thought to be in a border portion is carried out after a thresholding process is first performed on the second order differential signal of that pixel, and thus the depth information determination is also easily affected by how accurately the threshold is determined. Therefore, even if, for example, the border portions belong to the same object and the object is at the same distance as other objects, there is nevertheless a danger that areas where edge enhancement is stronger and weaker will arise, as well as a danger that areas of discontinuous luminance will arise due to only strong and weak edge emphasis being carried out on border portions.
It also cannot be determined whether a low second order differential luminance signal is caused by issues such as blurriness caused by the photographic conditions (focus point shifting or movement and so on) of the image, interpolation executed during a simple upconversion from a low-resolution image into a high-resolution image, and so on, or if it is caused by the pixel actually being in the background region. For this reason, when, for example, an image encoded using a lossy encoding scheme is decoded and distortion within the decoded image is eliminated using a low-pass filter or the like, there is a danger of the entire image being determined as being the background, due to the settings used for determining the value of the threshold of the second order differential signal, which can result in an inability to appropriately carry out the original edge enhancement process.
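The effect of low-pass filtering on the foreground/background determination can be illustrated with a small numeric sketch. The helper names, the 3-tap box filter, the sample values, and the threshold of 50 are all assumptions chosen for illustration; they do not come from Patent Document 1.

```python
def second_difference_peak(y):
    """Largest absolute discrete second difference in a row of luminance values."""
    return max(abs(y[i - 1] - 2 * y[i] + y[i + 1]) for i in range(1, len(y) - 1))


def box_blur3(y):
    """3-tap box filter with edge replication, standing in for a low-pass filter."""
    padded = [y[0]] + list(y) + [y[-1]]
    return [(padded[i] + padded[i + 1] + padded[i + 2]) / 3 for i in range(len(y))]


TH = 50                          # hypothetical threshold on |DY2|
sharp = [0, 0, 0, 90, 90, 90]    # a step edge
blurred = box_blur3(sharp)       # the same edge after low-pass filtering

sharp_is_foreground = second_difference_peak(sharp) > TH
blurred_is_foreground = second_difference_peak(blurred) > TH
```

In this example the peak second difference falls from 90 to 30 after filtering, so the same edge crosses from above the threshold to below it even though the scene geometry has not changed.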
Furthermore, with pseudo three-dimensional image forming devices and pseudo three-dimensional image forming methods such as those described in Non-Patent Document 1 and Patent Document 2, it is difficult to automatically determine vanishing points for various different types of images. It is quite difficult to automatically determine the vanishing point using, for example, the method disclosed in Patent Document 2, which finds the center point (the vanishing point) based on the point at which the sloping edge lines of the subject intersect with one another, or other perspective-based methods. Furthermore, there is a significant chance that a perspective-based structure, based on a vanishing point, cannot create a three-dimensional image that appears natural to the human eye for all inputted scenes.
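The geometric core of finding a vanishing point from intersecting edge lines can be sketched as a two-line intersection. This is an illustrative assumption about the general idea only, not the procedure of Patent Document 2; the function name and the representation of each edge line by two points are hypothetical.

```python
def line_intersection(p1, p2, p3, p4, eps=1e-9):
    """Intersect the line through p1, p2 with the line through p3, p4.

    Points are (x, y) tuples. Returns None when the lines are (nearly)
    parallel, in which case no vanishing point candidate exists.
    """
    (x1, y1), (x2, y2), (x3, y3), (x4, y4) = p1, p2, p3, p4
    denom = (x1 - x2) * (y3 - y4) - (y1 - y2) * (x3 - x4)
    if abs(denom) < eps:
        return None
    a = x1 * y2 - y1 * x2
    b = x3 * y4 - y3 * x4
    px = (a * (x3 - x4) - (x1 - x2) * b) / denom
    py = (a * (y3 - y4) - (y1 - y2) * b) / denom
    return (px, py)


candidate = line_intersection((0, 0), (2, 2), (0, 4), (4, 0))
no_candidate = line_intersection((0, 0), (1, 1), (0, 1), (1, 2))
```

Images without clearly sloping edge lines, or with edge lines that are nearly parallel in the image plane, yield no stable intersection, which is one reason automatic determination is difficult for arbitrary scenes.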