1. Field of the Invention
The present invention relates to methods for creating an image for a three-dimensional display (hereinafter referred to as a 3-D display), for calculating depth information, and for image processing using the depth information. The method for creating an image for a 3-D display particularly relates to a method for creating a pseudo-viewfinder image shot by a multi-eye camera (a stereo image) from a viewfinder image shot by a monocular camera. The method for calculating depth information relates to a method for obtaining a distance between an object and a viewer, which is applicable to practice the method for creating an image for a 3-D display. The method for image processing using the depth information relates to applications including the creation of an image for a 3-D display and the suspending of creation of an image for a 3-D display, an enhanced display, or the like.
2. Description of the Prior Art
[1] Creation of an Image for a 3-D Display
In fields related to television, techniques for creating a 3-D image (a pseudo stereoscopic vision) through the detection of a movement of a 2-D motion image have been known. One typical example of such a technique is a 3-D display employing a time difference method, the principle of which will be described with reference to FIGS. 1 to 3.
In a scene where an object moves from left to right while the background stays still, as shown in FIG. 1, by reproducing respective images for right and left eyes (hereinafter respectively referred to as right and left eye images) so as to have a predetermined lapse of time between them, as shown in FIG. 2, a parallax θ is caused, as shown in FIG. 3. "A parallax" or "a binocular disparity" is defined as an angular difference between sight vectors directed at one point from right and left eyes, respectively. In FIG. 1, since a viewer perceives the car as being closer than the background due to parallax, a pseudo stereoscopic vision is achieved. When the object, the car in this case, moves in the opposite direction, respective images should be reproduced such that the one for a right eye is reproduced earlier than the one for a left eye by a predetermined time, contrary to the example shown in FIG. 3.
JP Publication No. Sho 55-36240 discloses a display apparatus for a stereoscopic image using depth information, in which the apparatus receives only an image signal shot from one basic direction (that is, a 2-D motion image) among signals from multiple directions and a signal containing the depth information of an object, so as to reproduce within the apparatus the original viewfinder image shot from multiple directions. The purpose of the apparatus is to reduce a transmission bandwidth. The apparatus incorporates a variable delay circuit for causing a time delay while controlling the extent thereof according to depth information. The time delay results in a parallax. According to an output signal from the circuit, image signals are reproduced for right and left eyes. In this way, a pseudo stereoscopic image is displayed. This publication discloses, as a preferred embodiment of the disclosed apparatus, (1) a device for displaying a pseudo stereoscopic image for a viewer by respectively supplying right and left eye images to two CRTs, which are situated forming a predetermined angle with a half mirror, and (2) a device for displaying a pseudo stereoscopic image for a viewer even if the viewer moves in a horizontal direction, using a lenticular lens fixed to a display screen.
However, the above apparatus works on the condition that depth information is supplied externally. Therefore, if it only receives a 2-D motion image, the apparatus cannot create a pseudo stereoscopic image.
JP Application Laid-Open No. Hei 7-59119 also discloses an apparatus for creating a pseudo stereoscopic image based on a 2-D motion image. The apparatus comprises a detection circuit for detecting a motion vector from a supplied 2-D motion image, and a delay circuit for delaying either a right or a left image according to the motion vector. The delay causes a parallax. This application discloses, as a preferred embodiment of the disclosed apparatus, a head mounted display (HMD), which is a glasses type display for supplying different images to right and left eyes. Through the HMD, a viewer can see a pseudo stereoscopic image.
In this apparatus, however, since the extent of delay is determined according to the magnitude of a motion vector, any object moving at high speed appears to be closer to the viewer, resulting in an unnatural stereoscopic view, which is discordant to the effective distance between the viewer (or the camera) and the object (that is, a depth).
JP Laid-Open Application No. Sho 60-263594 also discloses an apparatus for displaying a pseudo stereoscopic image using a time difference method, in which the apparatus displays right and left images alternately for every field, so that they are seen alternately via shutter glasses for every field, as the shutter glasses alternately open their right and left eyes. This application further discloses a method for generating a stereoscopic effect by providing a longer time difference between right and left images when an object moves at low speed. However, since this apparatus also does not operate based on depth information, it is not really possible for an accurate pseudo stereoscopic image to be created and thus displayed.
"PIXEL" (No. 128), a magazine, issued on May 1, 1993 describes in pages 97 to 102 a pseudo stereoscopic image system using depth information. In the system, an object is first displayed as a gray-scale image where the gray-scale level corresponds to the depth, and then based on the gray level, the appropriate parallax is calculated in terms of the number of pixels, so that right and left images are created to be seen via shutter glasses. However, the perspective image is manually created and a technique for automating the creation is not disclosed.
National Publication No. Hei 4-504333 (WO88/04804) discloses a method for achieving a pseudo stereoscopic image using depth information. The method comprises steps of dividing a 2-D motion image into some areas, for giving the divided areas depth information, so as to provide each of the areas with a parallax, and for creating a pseudo stereoscopic image. However, the depth information is manually supplied and a technique for automating the supply is not disclosed.
In a research field called "Computer Vision," a study has been conducted into a method for estimating a 3-D structure and movement of an object. Concretely speaking, the study, which is aimed at self-control of a robot, relates to acquisition of an accurate distance from a viewpoint to an object by either shooting the object using a stereo camera (a multi-eye camera), or using a monocular camera while moving it. Several aspects of this technology are described in a report, entitled "1990 Picture Coding Symposium of Japan (PCSJ90)," for example, on page 57.
[2] Creation of Depth Information
Computer Vision would enable detection of the depth of an object. However, in the calculation of depth information, which is based on 2-D motion information, suitable images are not always supplied for the calculation. If the calculation is continued even with unsuitable images supplied, serious errors are likely to be caused. That is, if depth information is obtained from such unsuitable images, and then used for the creation of a stereoscopic image, it may be quite likely that the thus created stereoscopic image will be unnatural, i.e., exhibiting such anomalies as a person in the distance appearing closer than a person who actually is closer.
It is to be noted that the idea of obtaining depth information through understanding of a corresponding relationship between frames has been known. For example, JP Application Laid-Open No. Hei 7-71940 (which corresponds to U.S. Pat. No. 5,475,422) mentions, as a prior art, (1) a technique for relating a point or a line between two images shot by a stereo camera, so as to estimate the position of the point or line in actual space (the 3-D world), and (2) a technique for shooting an object on a camera while moving it, so as to obtain its sequential viewfinder images for tracing the movements of a characteristic point on the sequential viewfinder images, and thereby estimating the position of each characteristic point in actual space.
[3] An Image Processing Method Using Depth Information
A method for controlling the movement of a robot using depth information is known, such as the foregoing Computer Vision. A method for creating a pseudo stereoscopic image using depth information is also known, such as is disclosed in the foregoing JP Publication No. Sho 55-36240.
On the other hand, a method for using depth information in image processing other than the creation of a pseudo stereoscopic image has scarcely been proposed.
The first object of the present invention relates to the creation of an image for a 3-D display, as described in the above [1]. In defining the object of the present invention, the inventor draws attention to the fact that all of the foregoing techniques for creating a pseudo stereoscopic image have at least one of the following problems to be solved:
1. An accurate stereoscopic image based on depth information is not created. Instead, a mere 3-D effect is provisionally created according to the extent of movement. Further, since a parallax needs to be created using a delay in time (a time difference), horizontal movement of an object is required as a premise of the creation, which constitutes a fundamental restriction.
2. As it is not automated, the process for obtaining depth information from a 2-D motion image requires an editing process. Thus, a stereoscopic image cannot be output in real time upon input of the 2-D motion image.
Therefore, the first object of the present invention is to create an accurate stereoscopic image, based on depth information, by applying the foregoing technique relating to a computer vision to an image processing field including technical fields related to television.
In order to achieve this object, according to the present invention, depth information is extracted from a 2-D motion image, based on which an image for a 3-D display is created. This is applying a technique related to a computer vision to a technical field relating to an image display. According to one aspect of the present invention, depth information is obtained through the following processes: that is, the movement of a 2-D motion image is detected; a relative 3-D movement between the scene and the shooting viewpoint of the 2-D motion image is calculated; and relative distances from the shooting viewpoint to the respective image parts of the 2-D motion image are calculated, based on the relative 3-D movement and the movements of the respective image parts. Based on the thus obtained depth information, a pseudo stereoscopic image is created.
This aspect of the present invention can be differently described as a depth being obtained through the following processes: that is, a plurality of viewfinder frames (hereinafter referred to as frames) are selected from a 2-D motion image to be processed; and a relative positional relationship in the actual 3-D world of the respective image parts is identified based on a 2-D positional displacement between the frames. In other words, in order to determine the depth, 3-D movements of the respective image parts are calculated based on the 2-D positional displacement, based on which positional coordinates of the respective image parts in the 3-D world are calculated, according to the principle of triangulation. A frame is a unit for image processing, that is, a concept including a frame picture or a field picture in MPEG, and the like.
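As a minimal sketch of the triangulation principle, the following Python fragment assumes the simplest case, in which the relative 3-D movement between two frames is a pure sideways translation of the shooting viewpoint by a known distance. This is an illustrative simplification and not the general procedure of the invention; the function name and the parameters focal_length_px, baseline and displacement_px are hypothetical.

```python
def depth_from_displacement(focal_length_px, baseline, displacement_px):
    """Depth of an image part by triangulation, Z = f * B / d.

    Assumes (for illustration only) a pinhole camera that translates
    sideways by `baseline` between the two frames, so that a static image
    part is displaced by `displacement_px` pixels on the frame plane.
    """
    if displacement_px <= 0:
        raise ValueError("displacement must be positive for a finite depth")
    return focal_length_px * baseline / displacement_px


# An image part displaced by 8 pixels, with a focal length of 800 pixels
# and a viewpoint translation of 0.1 (in any length unit).
print(depth_from_displacement(800, 0.1, 8))
```

Under these assumptions, an image part with a larger 2-D positional displacement is computed to be closer to the shooting viewpoint.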
Regarding a 2-D motion image, a plurality of viewfinder frames are hereinafter referred to as "different-time frames," as they are shot at different times. (In the following description of a multi-eye camera, a plurality of frames which are simultaneously shot are referred to as "same-time frames.") A positional displacement on a frame plane is referred to as "a 2-D positional displacement." In this aspect of the present invention, where different-time frames are discussed, "a 2-D positional displacement" means a change caused along with a lapse of time, that is, a movement. (On the contrary, "a 2-D positional displacement" of same-time frames means a positional difference among a plurality of frames.)
The second object of the present invention relates to the calculation of depth information, as described in the above [2]. That is, the second object of the present invention is to propose a method for obtaining a correct corresponding relationship among a plurality of images, so as to calculate accurate depth information, for selecting an image to be input appropriate for the calculation, and for discontinuing the calculation of depth information when any inconvenience occurs, such as could cause an unnatural pseudo stereoscopic image to be created. Further, the present invention aims to propose methods for effectively determining corresponding and characteristic points, and for searching and tracing points with a high accuracy.
In order to achieve this object, according to the present invention, two frames with appropriately large movements between them are selected from a 2-D motion image, so that depth information is obtained from the two frames. According to this aspect of the invention, it is possible to obtain a good calculation result, with pre-selection of frames which may facilitate the calculation. A judgement as to whether frames have appropriately large movement or not may be based on the extent or variance of movement of a characteristic point.
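The frame pre-selection may be sketched as follows. The sketch assumes that the movements of the characteristic points relative to the reference frame are already available as (dx, dy) pairs for each candidate frame; the function names and the threshold values min_extent and min_variance are hypothetical and chosen only for illustration.

```python
from math import hypot
from statistics import mean, pvariance


def movement_is_sufficient(displacements, min_extent=2.0, min_variance=0.5):
    """Judge movement by its extent (mean magnitude) or its variance."""
    if not displacements:
        return False
    magnitudes = [hypot(dx, dy) for dx, dy in displacements]
    return mean(magnitudes) >= min_extent or pvariance(magnitudes) >= min_variance


def select_object_frame(candidates, **thresholds):
    """Return the index of the first candidate frame whose characteristic
    points have moved enough relative to the reference frame, else None."""
    for index, displacements in enumerate(candidates):
        if movement_is_sufficient(displacements, **thresholds):
            return index
    return None


candidates = [
    [(0.1, 0.0), (0.0, 0.1)],  # almost no movement yet
    [(3.0, 4.0), (6.0, 8.0)],  # appropriately large movement
]
print(select_object_frame(candidates))
```

Returning None when no candidate qualifies leaves the caller free to wait for later frames instead of calculating depth from unsuitable ones.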
According to another aspect of the invention, with a representative point provided in a reference frame, the similarity of images is evaluated between an image area including a characteristic point in the other frame (hereinafter referred to as an object frame), and an image area including the representative point in the reference frame. A characteristic point is a candidate for a corresponding point subject to an evaluation, the candidate being arbitrarily determined. Then, the relative positional acceptability between the characteristic point and the other characteristic point is evaluated. That is, a judgement is made as to whether the relative positional relationship between the characteristic point and the other characteristic point is reasonable or acceptable with respect to being determined as the same as the relative positional relationship between the representative point and the other representative point, respectively corresponding to the characteristic points. When both evaluations result in a favorable score, the characteristic point is tentatively determined as a corresponding point of the representative point. Subsequently, a best point is searched for where each of the evaluations yields the best result, by moving one corresponding point within a predetermined search area, while assuming that all the other corresponding points are fixed (this method hereinafter being referred to as a fixed searching method). The best position, which has been found during the search, is determined as a new position of the corresponding point. All corresponding points are sequentially subjected to this search and the positional change. Afterwards, depth information is obtained based on a positional relationship between a representative point and its corresponding point, the corresponding point having been obtained through the above mentioned series of processes.
Conventionally, the similarity of images has been evaluated by block matching or the like. In this invention, on the other hand, by additionally evaluating the relative positional acceptability, the corresponding relationship between frames can be more accurately detected. The accuracy can be further improved through iterative calculations.
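The fixed searching method may be sketched as follows. The sketch assumes that the two evaluations are supplied as cost functions (a lower value being better) and that they are combined by simple addition; the names fixed_search, image_cost, position_cost, radius and sweeps are hypothetical, and the actual form of the evaluations is not prescribed here.

```python
def fixed_search(points, image_cost, position_cost, radius=1, sweeps=3):
    """Refine corresponding points one at a time, all others held fixed.

    points        -- list of (x, y) tentative corresponding points
    image_cost    -- image_cost(i, candidate): dissimilarity of the image
                     area around `candidate` to that of representative point i
    position_cost -- position_cost(i, trial_points): penalty for an
                     unacceptable relative position of point i
    """
    points = list(points)
    for _ in range(sweeps):  # iterative calculation
        for i, (x, y) in enumerate(points):
            best, best_cost = (x, y), None
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    candidate = (x + dx, y + dy)
                    trial = points[:i] + [candidate] + points[i + 1:]
                    cost = image_cost(i, candidate) + position_cost(i, trial)
                    if best_cost is None or cost < best_cost:
                        best, best_cost = candidate, cost
            points[i] = best  # new position of the corresponding point
    return points


# Toy evaluation: the image cost is the squared distance to a known target,
# and the positional cost is switched off.
targets = [(2, 1), (5, 5)]
cost = lambda i, p: (p[0] - targets[i][0]) ** 2 + (p[1] - targets[i][1]) ** 2
print(fixed_search([(0, 0), (5, 5)], cost, lambda i, pts: 0))
```

Because each point moves by at most `radius` per sweep, several sweeps may be needed before the positions settle, which corresponds to the iterative calculations mentioned above.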
According to one aspect of the present invention, the similarity of the images is evaluated by block matching which is modified such that the similarity is correctly evaluated to be highest when the blocks including the identical object are tested, regardless of shooting conditions (hereinafter referred to as biased block matching). As to same-time frames, a certain color deflection tends to occur due to characteristic differences of a plurality of cameras. As to different-time frames, the same problem will arise due to changing weather from time to time, as this causes a change in brightness of a viewfinder image. After correction is made to solve such problems, the similarity of images is transformed to be expressed in the form of a geometrical distance, which is a concept for judging the acceptability of relative positions. Then, the relative positional acceptability and the transformed similarity are combined together to be used for a general judgement on the evaluation results. In this case, biased block matching may be conducted within a correction limitation, which is pre-determined depending on a distance between the reference and object frames. That is, when the distance is larger, a larger correction limitation is set accordingly.
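One possible form of biased block matching is sketched below. The sketch assumes that the shooting conditions differ only by a uniform brightness offset between the two blocks, and that the correction limitation grows linearly with the distance between the frames; both are illustrative assumptions, and the names biased_block_difference, correction_limit and limit_per_frame are hypothetical.

```python
def correction_limit(frame_distance, limit_per_frame=2.0):
    """A larger distance between frames permits a larger correction."""
    return limit_per_frame * frame_distance


def biased_block_difference(ref_block, obj_block, max_correction):
    """Sum of squared differences after removing a brightness offset.

    Both blocks are flat lists of pixel intensities of equal length. The
    offset is the mean intensity difference, clamped to +/- max_correction.
    """
    if len(ref_block) != len(obj_block) or not ref_block:
        raise ValueError("blocks must be non-empty and of equal size")
    offset = sum(o - r for r, o in zip(ref_block, obj_block)) / len(ref_block)
    offset = max(-max_correction, min(max_correction, offset))
    return sum((o - offset - r) ** 2 for r, o in zip(ref_block, obj_block))


ref = [10, 20, 30, 40]
obj = [15, 25, 35, 45]  # same pattern, uniformly brighter by 5
print(biased_block_difference(ref, obj, max_correction=0))   # plain matching
print(biased_block_difference(ref, obj, max_correction=correction_limit(3)))
```

With the correction enabled, two blocks showing the identical pattern under different brightness yield no difference, whereas plain block matching would penalize them.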
A correction for off-setting a change in brightness is disclosed in JP Laid-Open No. Hei 5-3086630. However, the correction is applicable only to cases of fading-out or fading-in (a consistent change in brightness), but not to a partial change in brightness.
According to another aspect of the invention, depth information is obtained through the following processes: that is, a plurality of representative points are provided in a reference frame; a plurality of corresponding points of the representative points are determined in an object frame, so that each corresponds to a respective one of the representative points; and a positional relationship between at least a characteristic point among the representative points, and its corresponding points is obtained. As a characteristic point, a point whose position moves steadily among a plurality of different-time frames is selected, because such a point is considered to be accurately traced.
Likewise, according to another aspect of the present invention, if a point, whose displacement between same-time frames is substantially consistent or changes substantially consistently, also shows similarly consistent movement or change in movement between other same-time frames shot at a close but different time from the above, such a point may be selected as a characteristic point.
According to a further aspect of the present invention, depth information is obtained from the following processes: that is, a plurality of representative points are provided in a reference image; a plurality of corresponding points of the representative points are determined in the other image; and a positional relationship between the representative point and its corresponding point is obtained; depth information is calculated according to the positional relationship, in which the calculation of the depth information is discontinued when an insufficient number of characteristic points are established among the representative points or the movements of the characteristic points are too small, because it is then very unlikely that a positional relationship between viewfinder images will be obtained with a high accuracy.
Two conceptually different corresponding points exist, that is, a true corresponding point and a computed corresponding point. In principle, each representative point has a sole corresponding point, eliminating the possibility of the existence of any other corresponding points in any other positions. This idealistic sole corresponding point is the true corresponding point. On the other hand, a corresponding point determined through calculations for image processing does not necessarily coincide with the true corresponding point. This is the computed corresponding point, which may possibly exist in any position other than that of the true corresponding point, and change its position arbitrarily. The positional change may be resorted to in a process for improving the accuracy of the corresponding point, as described later. In this specification, the term "corresponding point" is used to include both true and computed corresponding points without a distinction between the two concepts, unless it is necessary to differentiate them.
According to a further aspect of the present invention, a depth of a 2-D image is obtained, in which when the depth of any point in a certain image is calculated as negative, the depth is recalculated while referring to the depth information of points close-by with a positive depth value. That is, when a depth is calculated as negative, that is probably because of unsuitable variables being used during the calculation. Therefore, such a negative depth should be corrected based on the depth of a close point.
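A minimal sketch of this correction follows. It assumes that the depth values are held in a rectangular grid and that a negative depth is replaced by the average of the positive depths among its eight immediate neighbours; the averaging rule and the name correct_negative_depths are illustrative assumptions.

```python
def correct_negative_depths(depth):
    """Replace each negative depth by the mean of nearby positive depths.

    `depth` is a list of rows. A negative value with no positive neighbour
    is left unchanged.
    """
    rows, cols = len(depth), len(depth[0])
    corrected = [row[:] for row in depth]
    for r in range(rows):
        for c in range(cols):
            if depth[r][c] >= 0:
                continue
            neighbours = [
                depth[rr][cc]
                for rr in range(max(0, r - 1), min(rows, r + 2))
                for cc in range(max(0, c - 1), min(cols, c + 2))
                if (rr, cc) != (r, c) and depth[rr][cc] > 0
            ]
            if neighbours:
                corrected[r][c] = sum(neighbours) / len(neighbours)
    return corrected


print(correct_negative_depths([[1.0, 2.0], [3.0, -5.0]]))
```

The neighbours are read from the original grid, so the result does not depend on the order in which the points are visited.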
The third object of the present invention relates to the above [3], that is, utilization of depth information in image processing other than the creation of a pseudo stereoscopic image.
In order to achieve this object, according to the present invention, in creating a stereo image by giving a parallax to a 2-D image according to its depth information, the parallax is first changed so as to fall within a predetermined range, so that the stereo image will be created according to the changed depth information. An excessively large parallax would cause fatigue on a viewer's eyes. On the contrary, an excessively small parallax would invalidate the meaning of a parallax as data. Therefore, it is necessary to keep a parallax within a desired range.
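One way of bringing the parallax into a predetermined range is a linear rescaling, sketched below. The choice of a linear mapping (as opposed to, for example, clipping) and the names fit_parallax_to_range, lowest and highest are assumptions made only for illustration.

```python
def fit_parallax_to_range(parallaxes, lowest, highest):
    """Linearly map parallax values onto the range [lowest, highest]."""
    if not parallaxes:
        return []
    smallest, largest = min(parallaxes), max(parallaxes)
    if largest == smallest:
        # No variation to preserve: place every value at the mid-range.
        return [(lowest + highest) / 2.0 for _ in parallaxes]
    scale = (highest - lowest) / (largest - smallest)
    return [lowest + (p - smallest) * scale for p in parallaxes]


print(fit_parallax_to_range([0, 5, 10], lowest=2, highest=4))
```

A linear mapping preserves the ordering of the image parts in depth, so that a nearer part remains nearer after the change.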
According to another aspect of the invention, in creating a stereo image by giving a parallax to a 2-D image according to its depth information, the parallax originally determined according to the depth information is set to be variable. With this arrangement, upon an instruction by a viewer to change a parallax, for example, it is possible to create and display a pseudo stereoscopic image which is agreeable to the preference of the viewer.
According to a further aspect of the invention, in creating a stereo image by giving a parallax to a 2-D image according to its depth information and displaying the stereo image on a stereo image display apparatus, a process to be conducted to the 2-D image so as to cause the parallax is determined according to a display condition unique to the stereo image display apparatus. The display condition is governed by the size of a display screen of the display apparatus, and an assumed distance from the display screen to a viewer.
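The dependence on the display condition may be illustrated by the following sketch, which converts a depth into a horizontal shift in pixels by similar triangles. It assumes a flat screen viewed head-on, depths measured from the viewer, and an eye separation of 0.065 m; these assumptions and all names (on_screen_parallax_px and its parameters) are hypothetical and are not taken from the disclosure.

```python
def on_screen_parallax_px(depth_m, viewing_distance_m, screen_width_m,
                          horizontal_resolution_px, eye_separation_m=0.065):
    """Horizontal separation, in pixels, between the right and left eye
    images of a point that should appear at `depth_m` from the viewer.

    By similar triangles the separation on the screen plane is
    e * (Z - D) / Z, where e is the eye separation, Z the depth and D the
    viewing distance. It is zero for a point on the screen plane.
    """
    if depth_m <= 0:
        raise ValueError("depth must be positive")
    separation_m = eye_separation_m * (depth_m - viewing_distance_m) / depth_m
    return separation_m * horizontal_resolution_px / screen_width_m


# The same depth needs a different pixel shift on differently sized screens.
print(on_screen_parallax_px(4.0, 2.0, screen_width_m=1.0, horizontal_resolution_px=1000))
print(on_screen_parallax_px(4.0, 2.0, screen_width_m=0.5, horizontal_resolution_px=1000))
```

Under these assumptions, halving the screen width at the same resolution doubles the pixel shift needed to present the same depth, which is why the process is determined per display apparatus.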
According to a further aspect of the invention, in creating a stereo image by giving a parallax for every image part of a 2-D image according to its depth information, an uneven image frame outline caused by the given parallax is corrected. More particularly, in giving a parallax, if an image area shown at the right end of the screen, for example, is displaced slightly rightward, the image area resultantly projects off the original shape of the image frame, and thus causes uneven parts along the edge of the image frame. A correction made to such an uneven part would straighten the appearance of the frame. The correction may be made by uniformly cutting off a peripheral part of the frame at a certain width, so as to achieve a desired shape of the image frame.
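The uniform cutting-off of a peripheral part may be sketched as follows, assuming the frame is held as a list of pixel rows and the width to be cut is given in pixels. The names crop_frame and margin are hypothetical.

```python
def crop_frame(frame, margin):
    """Cut off a border of `margin` pixels on all four sides of the frame."""
    rows, cols = len(frame), len(frame[0])
    if margin < 0 or 2 * margin >= min(rows, cols):
        raise ValueError("margin must leave at least one pixel")
    return [row[margin:cols - margin] for row in frame[margin:rows - margin]]


frame = [
    [0, 1, 2, 3],
    [4, 5, 6, 7],
    [8, 9, 10, 11],
    [12, 13, 14, 15],
]
print(crop_frame(frame, 1))
```

In practice the margin would be chosen to be at least as wide as the largest displacement given to any image part at the edge of the frame.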
According to a further aspect of the invention, in a method where image processing is carried out for a 2-D image according to its depth information, an image area subject to the image processing is determined, based on the depth information. With this arrangement, it is possible to separate an object or to change the scale of an object a certain distance from a viewer.
According to a further aspect of the invention, in a method where image processing is carried out on a 2-D image according to its depth information, images with viewpoints at a plurality of points on a hypothetical moving path, where a shooting point of the 2-D image is hypothetically moved, are created for use as a slow motion image, based on the depth information.
It is to be noted that, according to the present invention, a viewfinder image seen from a different point may be created according to depth information. Positional displacements of respective image parts, which will be caused accompanying a change in the viewpoint, are calculated based on depth information, so that a viewfinder image is re-created so as to correspond to the positional displacements caused. When a viewpoint is changed in height, for example, a displacement (the extent of translation and rotation movements) of the object (respective image parts) can be calculated based on the distance by which the camera has moved and the depth information, so that a desired viewfinder image will be created based on the calculation result.
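The displacement caused by a change in the viewpoint may be sketched for the simplest case as follows. The sketch assumes a pinhole camera that is translated parallel to the frame plane without rotation, so that only the translation component of the displacement is covered; the names image_displacement, focal_length_px and camera_shift are hypothetical.

```python
def image_displacement(focal_length_px, camera_shift, depth):
    """Displacement (dx, dy) in pixels of a static image part when the
    camera is translated by camera_shift = (tx, ty) parallel to the frame
    plane.

    With the pinhole projection x = f * X / Z, moving the camera by tx
    moves the projected point by -f * tx / Z, and likewise for y.
    """
    if depth <= 0:
        raise ValueError("depth must be positive")
    tx, ty = camera_shift
    return (-focal_length_px * tx / depth, -focal_length_px * ty / depth)


# Changing the viewpoint along the vertical image axis by 0.5:
# a near part is displaced more than a distant one.
print(image_displacement(800, (0.0, 0.5), depth=10.0))
print(image_displacement(800, (0.0, 0.5), depth=40.0))
```

Applying this displacement to every image part according to its own depth yields the re-created viewfinder image for the new viewpoint under the stated assumptions.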