The present invention relates to a texture information assignment method of assigning texture information to a shape model of a real object of interest according to an object image obtained by shooting that real object of interest, an object extraction method of extracting an object portion by removing an undesired portion such as the background from the object image, a three-dimensional model generation method of generating a three-dimensional model of an object of interest, and apparatus of these methods.
In accordance with the development of computer graphics and the like, there have been intensive efforts to provide a system for practical usage in three-dimensional graphics. However, one appreciable problem accompanying the spread of such a system of practical usage is the method of obtaining shape data. More specifically, the task of entering into a computer the complicated three-dimensional shape of an object that has a free-form surface or that resides in the natural world is extremely tedious and difficult.
Furthermore, in reconstructing an object with a computer and the like, it is difficult to express the texture of the surface of the object in a more realistic manner by just simply reconstructing the shape of the object.
Three-dimensional image information can be handled more easily if the shape information and color/texture information can be reconstructed within the computer based on image information that is obtained by shooting an actual object.
In three-dimensional image communication by way of, for example, the Internet, there will be more opportunities for a general user, as the transmitter of information, to create a three-dimensional image. Therefore, the need arises for a simple and compact apparatus that produces a three-dimensional image.
(1) Japanese Patent Laying-Open No. 5-135155 discloses a three-dimensional model generation apparatus that can construct a three-dimensional model from a series of silhouette images of an object of interest placed on a turntable under the condition of normal illumination.
According to this three-dimensional model construction apparatus, an object of interest that is rotated on a turntable is continuously shot by a camera. The silhouette image of the object of interest is extracted from the obtained image by an image processing computer. By measuring the horizontal distance from the contour of the silhouette image to the vertical axis of rotation for the silhouette image, a three-dimensional model is generated according to this horizontal distance and the angle of rotation. More specifically, the contour of the object of interest is extracted from the continuously shot silhouette images to be displayed as a three-dimensional model.
FIG. 1 is a diagram representing the concept of assigning texture information to the three-dimensional model generated as described above according to the image information continuously picked up by a camera.
Japanese Patent Laying-Open No. 5-135155 discloses the case of obtaining image information by continuously rotating an object of interest and shooting the same, i.e., obtaining image information in the resolution level of shape recognition with respect to a three-dimensional model of a human figure. More specifically, an image is picked up for every 1° of rotation to obtain 360 images with respect to the object of interest.
For the sake of simplifying the description, the case of shooting an image for every larger stepped angle will be described hereinafter. However, the essence is identical.
Consider the case of picking up a total of n images by rotating an object of interest for every predetermined angle of rotation, as shown in FIG. 1. In this case, each image information corresponds to the label number of 1, 2, 3 . . . , n.
The object of interest is represented as a shape model (wire frame model) 300 using a polygon (triangular patch). When texture information is to be assigned to shape model 300, color information (texture information) of the image information of a corresponding label number is assigned for each triangular patch according to the direction of the camera shooting the object of interest.
More specifically, based upon the vector towards the target triangular patch from the axis of rotation of shape model 300, the texture information with respect to the triangular patch is captured from the image whose shooting direction vector most closely matches this vector. Alternatively, from the standpoint of intuition, a plurality of lines such as the circles of longitude of a terrestrial globe can be assumed with respect to the surface of the model. Texture information can be captured from the first image information for the triangular patch in the range of 0° to 1×360/n°, from the second image information for the triangular patch in the range of 1×360/n° to 2×360/n°, and so on. This method of capturing texture information will be referred to as the central projection system hereinafter.
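The angular-range rule of the central projection system can be sketched as follows. This is a minimal illustration only: the function name `central_projection_label`, the choice of the z axis as the axis of rotation, and the convention that the first image covers the range starting at 0° are assumptions made for the example and are not taken from the cited reference.

```python
import math

def central_projection_label(patch_center, n_images):
    """Return the 1-based label number of the image whose angular range
    (width 360/n degrees about the axis of rotation) contains the patch.

    Assumption: the axis of rotation is the z axis, so only x and y matter.
    """
    x, y, _z = patch_center
    # Angle of the patch as seen from the axis of rotation, in [0, 360).
    angle = math.degrees(math.atan2(y, x)) % 360.0
    sector = 360.0 / n_images
    # The trailing modulo guards against angle rounding up to exactly 360.
    return int(angle // sector) % n_images + 1

# With n = 4 images, each image serves a 90-degree range.
label = central_projection_label((-0.1, 1.0, 0.0), 4)  # patch near 96 degrees
```

Because every patch falls in exactly one range, the correspondence is one-to-one, which is the property noted for this system.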
The central projection system is advantageous in that image information can be provided in a one-to-one correspondence with respect to each triangular patch or the constituent element forming the shape model (referred to as “three-dimensional shape constituent element” hereinafter), and that this correspondence can be determined easily.
However, the central projection system is disadvantageous in that the joint of the texture is noticeable when the gloss or the texture of the color information is slightly different due to the illumination and the like since the texture information is assigned from different image information (image information of a different label number) to a three-dimensional shape constituent element that is not present within the same range of rotation angle when viewed from the axis of rotation.
Furthermore, a corresponding three-dimensional shape constituent element may be occluded in the image information obtained from a certain direction of pickup depending upon the shape of the object of interest. There is a case where no texture information corresponding to a certain three-dimensional shape constituent element is included in the corresponding image information.
FIG. 2 is a diagram for describing such a situation. In FIG. 2, the relationship is shown of the axis of rotation, the cross section of the object of interest and the object image projected in the camera at a vertical plane including the axis of rotation of the object of interest. When the object of interest takes a shape that has an occluded region that cannot be viewed from the camera as shown in FIG. 2, the image information picked up from this angle direction is absent of the texture information corresponding to this occluded region. However, texture information of this occluded region can be captured from another pickup direction that has a certain angle with respect to the previous direction of pickup.
(2) As a conventional method, extraction of an object portion from an image of an object can be effected manually using an auxiliary tool. More specifically, the image of an object obtained by shooting the target object together with the background is divided into a plurality of regions. The operator selects the background area in the image of the object to erase the background area using a mouse or the like. However, this method is disadvantageous in that the burden on the operator for the manual task is too heavy.
Another conventional method of object extraction employs the chroma-key technique. More specifically, the portion of the object is extracted from the image of the object using a backboard of the same color. However, this method is disadvantageous in that a special environment of a backboard of the same color has to be prepared.
A further conventional method of object extraction employs the simple difference method. More specifically, difference processing is effected between an object image and a background image in which only the background of the object of interest is shot to obtain the difference. The area having an absolute value of the difference greater than the threshold value is extracted as the portion of the object. However, there is a problem that, when the object of interest includes an area of a color identical to the color of the background, that portion cannot be extracted as a portion of the object. In other words, this method is disadvantageous in that the extraction accuracy of the object portion is poor.
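The simple difference method and its weakness can be illustrated with a short sketch. The function name `simple_difference_mask`, the use of single-channel images stored as nested lists, and the sample values are assumptions chosen for the example.

```python
def simple_difference_mask(object_image, background_image, threshold):
    """Mark a pixel as object when |object - background| exceeds the threshold."""
    return [
        [abs(o - b) > threshold for o, b in zip(obj_row, bg_row)]
        for obj_row, bg_row in zip(object_image, background_image)
    ]

# Hypothetical 2x2 example: the right column belongs to the object, but the
# lower-right object pixel happens to have the same value as the background.
background = [[100, 100],
              [100, 100]]
shot = [[100, 200],
        [100, 100]]
mask = simple_difference_mask(shot, background, threshold=30)
# mask is [[False, True], [False, False]]: the lower-right object pixel is lost.
```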
Another conventional method of object extraction takes advantage of the depth information by the stereo method. More specifically, the area with the depth information that is smaller than a threshold value is extracted as the portion of an object of interest from an image of the object obtained by shooting the object together with the background. However, the difference in depth is so great in the proximity of the boundary between the object of interest and the background that proper depth information cannot be obtained reliably. There is a problem that a portion of the background is erroneously extracted as a portion of the object.
All of the above-described conventional methods require the determination of a threshold value in advance. It is extremely difficult to determine an appropriate threshold value on account of the conversion property of the A/D converter for converting the image and the property of the illumination. There is also the problem that the threshold value must be reselected when the conversion characteristic of the A/D converter or the property of the illumination is changed.
(3) A three-dimensional digitizer is known as a conventional apparatus of reading out the shape of an object of interest. The three-dimensional digitizer includes an arm with a plurality of articulations and a pen. The operator provides control so as to bring the pen in contact with the object of interest. The pen is moved along on the object of interest. The angle of the articulation of the arm varies as the pen is moved. A three-dimensional shape of the object of interest is obtained according to the angle information of the articulation of the arm. However, such a digitizer is disadvantageous in that the time and the labor of the task of measurement by manual means are too great and heavy.
The laser scanner is known as another conventional apparatus. The laser scanner directs a laser beam on an object of interest to scan the object. As a result, a three-dimensional shape of the object of interest is obtained. There is a problem that a three-dimensional model of an object of interest formed of a substance that absorbs light cannot be obtained with such a laser scanner. There is also the problem that the apparatus is extremely complex and costly. Furthermore, there is a problem that the environment for pickup is limited since measurement of the object of interest must be carried out in a dark room. There is also the problem that color information cannot be easily input.
U.S. Pat. No. 4,982,438 discloses a three-dimensional model generation apparatus. This apparatus computes a hypothetical existing region using the silhouette image of an object of interest. This hypothetical existing region is a conical region with the projection center of the camera as the vertex and the silhouette of an object of interest as the cross section. This conical region (hypothetical existing region) is described with a voxel model. This process is carried out for a plurality of silhouette images. Then, a common hypothetical existing region is obtained to generate a three-dimensional model of the object of interest. Here, the common hypothetical existing region is the ANDed area of a plurality of hypothetical existing regions with respect to the plurality of silhouette images. However, there is a problem that a three-dimensional model of high accuracy cannot be generated when there is one inaccurate silhouette image since the three-dimensional shape is obtained by the AND operation. There is also a problem that color information is insufficient or a local concave area cannot be recognized since the object of interest is shot only from a horizontal direction (direction perpendicular to the axis of rotation).
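The AND operation and its sensitivity to a single inaccurate silhouette can be sketched as follows. As a simplification, each hypothetical existing region is assumed to be already available as a set of occupied voxel indices; how the conical region is rasterized into voxels is outside the scope of the example, and the name `common_existing_region` is hypothetical.

```python
def common_existing_region(regions):
    """Intersect (AND) the voxel sets obtained from all silhouette images."""
    if not regions:
        raise ValueError("at least one hypothetical existing region is required")
    common = set(regions[0])
    for region in regions[1:]:
        common &= set(region)  # a voxel survives only if every view contains it
    return common

# Three views that agree on voxel (1, 0, 0).
views = [
    {(0, 0, 0), (1, 0, 0), (2, 0, 0)},
    {(0, 0, 0), (1, 0, 0)},
    {(1, 0, 0), (2, 0, 0)},
]
model = common_existing_region(views)             # {(1, 0, 0)}
# One inaccurate silhouette that misses the voxel removes it from the model.
damaged = common_existing_region(views + [{(0, 0, 0)}])  # set()
```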
In the above three-dimensional model generation apparatus of Japanese Patent Laying-Open No. 5-135155, an object of interest that is rotating on a turntable is shot by a camera to obtain a plurality of silhouette images. A plurality of shapes of the object of interest at a plurality of horizontal planes (a plane perpendicular to the axis of rotation) are obtained on the basis of this plurality of silhouette images. The points on the contour line of the shape of the object of interest in adjacent horizontal planes are connected as a triangular patch. The point on the contour line of the shape of the object of interest in one horizontal plane is determined for every predetermined angle. A three-dimensional model of an object of interest is generated in this way. However, there is a problem in this apparatus that a special environment for shooting is required since a backboard to generate a silhouette image is used. Furthermore, the amount of data is great since the three-dimensional model is generated using the shape of the object of interest in a plurality of horizontal planes. There is therefore a problem that the process is time consuming.
In view of the foregoing, an object of the present invention is to provide a method and apparatus of texture information assignment that allows assignment of texture information to each three-dimensional shape constituent element forming a shape model regardless of the shape of the object of interest in the event of reconstructing a three-dimensional model within a computer and the like according to image information obtained by shooting a real object.
Another object of the present invention is to provide a method and apparatus of texture information assignment that allows assignment of texture information approximating the texture of a real object from image information obtained by shooting a real object in the assignment of texture information to a shape model according to picked up image information.
A further object of the present invention is to provide a method and apparatus of texture information assignment with less noticeable discontinuity (seam) in texture assigned to each three-dimensional shape constituent element constructing a shape model in assigning texture information to the shape model according to image information obtained by shooting a real object.
Still another object of the present invention is to provide a method and apparatus of object extraction that allows a portion of an object having a color identical to that of the background, if present in the object image, to be extracted.
A still further object of the present invention is to provide a method and apparatus of object extraction that can always extract a portion of an object stably and properly even when various characteristics change.
Yet a further object of the present invention is to provide a method and apparatus of object extraction that reduces manual work and dispenses with a special shooting environment.
Yet another object of the present invention is to provide a method and apparatus of three-dimensional model generation that reduces manual work.
Yet a still further object of the present invention is to provide a method and apparatus of three-dimensional model generation of a simple structure with few limitations on the shooting environment and the substance of the object of interest.
An additional object of the present invention is to provide a method and apparatus of three-dimensional model generation that can generate a three-dimensional model with high accuracy even if several of a plurality of silhouette images are inaccurate.
Still a further object of the present invention is to provide a method and apparatus of three-dimensional model generation in which sufficient color information can be obtained and that allows recognition of a local concave portion in an object of interest.
Yet a still further object of the present invention is to provide a method and apparatus of three-dimensional model generation that can generate a three-dimensional model at high speed with fewer data to be processed and that dispenses with a special shooting environment.
According to an aspect of the present invention, a texture information assignment apparatus for a shape model includes: means for describing the shape of an object of interest as a shape model by a set of a plurality of three-dimensional shape constituent elements; and means for assigning texture information to the shape model, per three-dimensional shape constituent element, according to the amount of texture information that each object image information holds for that three-dimensional shape constituent element, on the basis of a plurality of pieces of object image information captured by shooting the object of interest from different viewpoints.
Preferably, the texture information amount is represented by the matching degree between the direction of the surface normal of each three-dimensional shape constituent element and the shooting direction of each object image information per three-dimensional shape constituent element.
Preferably, the texture information amount is represented by the area of the three-dimensional shape constituent element that is projected on each object image information per three-dimensional shape constituent element.
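The two measures of texture information amount named above can be sketched as follows. This is a minimal illustration under stated assumptions: the matching degree is taken as the cosine between the surface normal and the direction from the element toward the camera, clamped at zero when the element faces away, and the projected area is computed for a triangular patch whose vertices are assumed to be already projected onto the image plane. The names `normal_match` and `projected_area` are hypothetical.

```python
import math

def normal_match(normal, to_camera):
    """Cosine between the surface normal and the direction toward the camera.

    Returns 0.0 when the element faces away from the camera (assumption).
    """
    dot = sum(n * c for n, c in zip(normal, to_camera))
    norm = math.sqrt(sum(n * n for n in normal)) * math.sqrt(sum(c * c for c in to_camera))
    if norm == 0.0:
        return 0.0
    return max(0.0, dot / norm)

def projected_area(triangle_2d):
    """Area of a triangle given by three (x, y) image points (shoelace formula)."""
    (x0, y0), (x1, y1), (x2, y2) = triangle_2d
    return abs((x1 - x0) * (y2 - y0) - (x2 - x0) * (y1 - y0)) / 2.0

facing = normal_match((0.0, 0.0, 1.0), (0.0, 0.0, 2.0))   # 1.0, seen head-on
grazing = normal_match((0.0, 0.0, 1.0), (3.0, 0.0, 0.0))  # 0.0, seen edge-on
area = projected_area([(0.0, 0.0), (4.0, 0.0), (0.0, 3.0)])  # 6.0 square pixels
```

Either value can serve as the per-image score of an element; an element seen edge-on or occluded scores low under both measures.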
According to another aspect of the present invention, a texture information assignment apparatus for a shape model includes: means for describing the shape of an object of interest as a shape model by a set of a plurality of three-dimensional shape constituent elements; and means for assigning, per three-dimensional shape constituent element, the texture information for the shape model according to both the texture information amount for the three-dimensional shape constituent element of each object image information and the texture continuity between three-dimensional shape constituent elements, on the basis of a plurality of pieces of object image information captured by shooting the object of interest from different viewpoints.
Preferably, the texture information assignment means assigns the texture information for a shape model from the object image information provided in correspondence with each three-dimensional shape constituent element so as to minimize an evaluation function that decreases in accordance with increase of the texture information amount and that decreases in accordance with improvement in texture continuity between three-dimensional shape constituent elements.
In the above evaluation function, the texture continuity is represented as a function of difference in the shooting position and the shooting direction of respective corresponding object image information between a three-dimensional shape constituent element of interest and an adjacent three-dimensional shape constituent element.
Preferably in the above evaluation function, the texture continuity is represented as a function that increases in accordance with a greater difference between the label number assigned to a three-dimensional shape constituent element of interest and the label number assigned to a three-dimensional shape constituent element that is adjacent to the three-dimensional shape constituent element of interest when object image information is picked up according to change in position and a label number is applied to each object image information corresponding to the change in position.
Preferably in the above evaluation function, the texture continuity is represented as a function that increases in accordance with a greater difference between the label number assigned to a three-dimensional shape constituent element of interest and the label number assigned to a three-dimensional shape constituent element adjacent to the three-dimensional shape constituent element of interest when object image information is picked up according to a regular change in position and a label number is applied to each object image information corresponding to the change in position.
Preferably in the above evaluation function, the texture information amount is represented as a function of an area of a three-dimensional shape constituent element projected on each object image information per three-dimensional shape constituent element.
Preferably in the above evaluation function, the texture information amount is represented as a function of a level of match between the direction of the surface normal of each three-dimensional shape constituent element and the shooting direction of each object image information per three-dimensional shape constituent element.
Preferably, the above evaluation function is represented as a linear combination of the total sum of the difference between the label number assigned to the i-th (i: natural number) three-dimensional shape constituent element and the label number assigned to the three-dimensional shape constituent element adjacent to the i-th three-dimensional shape constituent element for all three-dimensional shape constituent elements, and the total sum of the area of the i-th three-dimensional shape constituent element projected on the object image information corresponding to the label number assigned to the i-th three-dimensional shape constituent element for all three-dimensional shape constituent elements.
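An evaluation function of this form can be sketched as follows. The sketch assumes that the label-difference sum enters with a positive sign and the projected-area sum with a negative weight `k`, so that the value decreases as continuity improves and as the texture information amount increases; the weight, the 0-based label indices, the exhaustive search, and all names are assumptions made for the illustration and are practical only for very small models.

```python
from itertools import product

def evaluation(labels, adjacency, area, k=1.0):
    """Label-difference sum over adjacent elements minus k times the area sum.

    labels[i]    : label number assigned to element i (0-based here)
    adjacency[i] : indices of the elements adjacent to element i
    area[i][l]   : area of element i projected on the image with label l
    """
    n = len(labels)
    continuity = sum(abs(labels[i] - labels[j]) for i in range(n) for j in adjacency[i])
    amount = sum(area[i][labels[i]] for i in range(n))
    return continuity - k * amount

def best_labeling(adjacency, area, n_labels, k=1.0):
    """Exhaustively search the labeling that minimizes the evaluation function."""
    n = len(adjacency)
    return min(product(range(n_labels), repeat=n),
               key=lambda labels: evaluation(labels, adjacency, area, k))

# Two mutually adjacent elements and two images (hypothetical areas).
adjacency = [[1], [0]]
area = [[5.0, 1.0],
        [4.5, 5.0]]
best = best_labeling(adjacency, area, n_labels=2)
```

In this example element 1 projects slightly larger on image 1, yet the minimum is reached when both elements take image 0, because the continuity term outweighs the small gain in area.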
According to a further aspect of the present invention, a texture information assignment apparatus for a shape model includes: means for describing the shape of an object of interest as a shape model by a set of a plurality of three-dimensional shape constituent elements; means for providing correspondence between a label number and every three-dimensional shape constituent element so as to minimize an evaluation function that decreases in accordance with increase of a texture information amount for each three-dimensional shape constituent element and that decreases in accordance with improvement of texture continuity in the texture information assigned to each three-dimensional shape constituent element and an adjacent three-dimensional shape constituent element when a plurality of pieces of object image information are picked up in accordance with change in position and a label number is applied to each object image information corresponding to change in position; and means for assigning texture information to a three-dimensional shape constituent element by carrying out a weighted mean process according to the area of a three-dimensional shape constituent element projected on each object image information on the basis of object image information corresponding to the related label number and the object image information corresponding to a predetermined number of label numbers including that related label number.
Preferably, the means for assigning texture information to the three-dimensional shape constituent element obtains the area projected on the object image information corresponding to the label number related to the three-dimensional shape constituent element and the object image information corresponding to the predetermined number of label numbers including the related label number for the three-dimensional shape constituent element, and uses this as the weighting factor in carrying out a weighted mean process. For the texture information of the three-dimensional shape constituent element, the portion of the three-dimensional shape constituent element projected on the object image information is obtained. The image information (color, density or luminance) of this projected portion is subjected to a weighted mean process to result in the texture information.
According to still another aspect of the present invention, a texture information assignment apparatus for a shape model includes: means for describing the shape of an object of interest as a shape model by a set of a plurality of three-dimensional shape constituent elements; means for providing correspondence between a label number and every three-dimensional shape constituent element so as to minimize an evaluation function that decreases in accordance with increase of texture information amount for each three-dimensional shape constituent element and that decreases in accordance with improvement in texture continuity of texture information respectively assigned to each three-dimensional shape constituent element and an adjacent three-dimensional shape constituent element when a plurality of pieces of object image information are picked up according to regular change in position and a label number is applied to each object image information corresponding to change in position; and means for assigning texture information to a three-dimensional shape constituent element by carrying out a weighted mean process according to an area of a three-dimensional shape constituent element projected on each object image information on the basis of the object image information corresponding to a related label number and the object image information corresponding to a predetermined number of label numbers including that related label number.
Preferably, the means for assigning texture information to a three-dimensional shape constituent element obtains the area projected on the object image information corresponding to the label number related to a three-dimensional shape constituent element and the object image information corresponding to the predetermined number of label numbers including the related label number for the three-dimensional shape constituent element, and uses this as the weighting factor for a weighted mean process. For the texture information of a three-dimensional shape constituent element, the portion where the three-dimensional shape constituent element is projected on the object image information is obtained. The image information (color, density or luminance) of this projected portion is subjected to a weighted mean process to result in the texture information.
According to a still further aspect of the present invention, a texture information assignment apparatus for a shape model includes: means for capturing a plurality of pieces of object image information by shooting an object of interest from different viewpoints; means for describing the shape of the object of interest as a shape model by a set of a plurality of three-dimensional shape constituent elements; and means for assigning texture information obtained by carrying out a weighted mean process for all the object image information according to the area corresponding to the three-dimensional shape constituent element projected on the plurality of pieces of object image information for every three-dimensional shape constituent element.
Preferably, the means for assigning texture information to the three-dimensional shape constituent element obtains the area projected on the object image information for each three-dimensional shape constituent element, and uses the obtained area as the weighting factor in carrying out the weighted mean process. For the texture information of the three-dimensional shape constituent element, the portion of the three-dimensional shape constituent element projected on the object image information is obtained. The image information (color, density or luminance) of this projected portion is subjected to a weighted mean process to result in the texture information.
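The weighted mean process can be sketched for a single three-dimensional shape constituent element as follows. The sketch assumes that the color of the projected portion has already been sampled from each object image and that the projected areas are known; the function name `weighted_mean_texture` and the sample values are hypothetical.

```python
def weighted_mean_texture(colors, areas):
    """Average per-image colors of one element, weighted by projected area.

    colors[m] : color sampled from image m for this element, e.g. (R, G, B)
    areas[m]  : area of the element projected on image m (0 if not visible)
    Returns None when the element is visible in no image.
    """
    total = sum(areas)
    if total == 0:
        return None
    channels = len(colors[0])
    return tuple(
        sum(a * c[ch] for a, c in zip(areas, colors)) / total
        for ch in range(channels)
    )

# The element projects three times larger on the first image than on the second.
texture = weighted_mean_texture([(100, 0, 0), (200, 0, 0)], [3, 1])
# texture == (125.0, 0.0, 0.0)
```

An image in which the element has zero projected area contributes nothing to the result, so an occluded view does not disturb the assigned texture.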
According to the texture information assignment apparatus, the most appropriate texture information of the actual object can be selectively assigned to the shape model from among the plurality of pieces of image information obtained by shooting an object of interest when the shape model is reconstructed within a computer on the basis of image information obtained by shooting an actual object.
When texture information (color information) is to be assigned to the shape model represented as a set of a plurality of three-dimensional shape constituent elements, the texture information most approximating the texture information of the actual object can be selectively assigned to each three-dimensional shape constituent element while suppressing discontinuity in texture information between respective three-dimensional shape constituent elements.
Since the process of assigning texture information can be carried out by substitution with the labeling issue for each three-dimensional shape constituent element on the basis of the object image information obtained by shooting an actual object of interest, the process of applying the texture information to each three-dimensional shape constituent element can be carried out in a procedure suitable for computer processing and the like.
According to yet a further aspect of the present invention, an object extraction apparatus of extracting a portion of an object with an unwanted area removed from an object image obtained by shooting an object of interest includes: region segmentation means and extraction means. The region segmentation means divides the object image into a plurality of regions. The extraction means identifies and extracts an object portion in the object image by subjecting the information of each pixel in the object image to a process of consolidation for every region. Here, an unwanted portion is, for example, the background area.
Preferably in the extraction means, the process of consolidating the information of each pixel in the object image for every region is to average the information of each pixel in the object image for every region.
Preferably, the extraction means identifies and extracts the object portion in the object image by carrying out a thresholding process on the information of each pixel consolidated for every region.
Preferably, the information of each pixel in the object image is the difference information obtained by carrying out a difference process between a background image obtained by shooting only the background of the object of interest and an object image.
Preferably, the extraction means includes difference processing means, mean value output means, and threshold value processing means. The difference processing means carries out a difference process between the background image obtained by shooting only the background of the object of interest and the object image. The mean value output means obtains the mean value in each region for the absolute value of the difference obtained by the difference process. The threshold value processing means compares the mean value in a region with a predetermined value to extract the region where the mean value is equal to or greater than a predetermined value as the object portion.
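The combination of difference processing, per-region averaging and thresholding can be sketched as follows. The region segmentation itself is not reproduced: the sketch assumes that a region map giving a region number for every pixel is already available, and the names `extract_regions_by_mean_difference` and `region_map` are hypothetical.

```python
def extract_regions_by_mean_difference(object_image, background_image, region_map, threshold):
    """Return the region numbers whose mean absolute difference is >= threshold."""
    sums, counts = {}, {}
    for y, row in enumerate(object_image):
        for x, value in enumerate(row):
            region = region_map[y][x]
            sums[region] = sums.get(region, 0) + abs(value - background_image[y][x])
            counts[region] = counts.get(region, 0) + 1
    return {region for region in sums if sums[region] / counts[region] >= threshold}

# Hypothetical 2x3 example: region 1 is the object, and its upper-right pixel
# has the same value as the background.
background = [[10, 10, 10],
              [10, 10, 10]]
shot = [[10, 200, 10],
        [12, 190, 210]]
region_map = [[0, 1, 1],
              [0, 1, 1]]
object_regions = extract_regions_by_mean_difference(shot, background, region_map, threshold=50)
```

Region 1 has a mean absolute difference of 142.5 and is extracted as a whole, so the pixel that matches the background color is kept as part of the object, whereas a per-pixel threshold would discard it.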
Preferably, the extraction means comprises mean value output means, difference processing means, and threshold value processing means. The mean value output means computes the mean value of the pixels in each region of the object image. The difference processing means carries out a difference process between the mean value of the pixels in each region of the object image and the mean value of the pixels in a corresponding region of the background image. The threshold value processing means compares the absolute value of the difference obtained by the difference process with a predetermined value to extract the region where the absolute value of the difference is greater than the predetermined value as the object portion.
Preferably, the information of each pixel of the object image is the depth information.
According to yet another aspect of the present invention, the object extraction apparatus of extracting an object portion with an unwanted area removed from the object image obtained by shooting the object of interest includes: depth information computation means, region segmentation means, mean value computation means, and extract means. The depth information computation means computes the depth information of the object image. The region segmentation means divides the object image into a plurality of regions. The mean value computation means computes the mean value of the depth information for each region. The extract means extracts as an object portion a region out of the plurality of regions that has a mean value within a predetermined range, i.e. a region having a mean value smaller than a predetermined value, particularly when no object located in front of the object of interest is included in the object image.
According to a still further aspect of the present invention, an object extraction apparatus for extracting an object portion by removing an unwanted portion from an object image, on the basis of the object image obtained by shooting an object of interest and a plurality of background images obtained by shooting only the background of the object of interest a plurality of times, includes difference means, extraction means, and threshold value determination means. The difference means computes the difference between the object image and the background images. The extraction means extracts, as the object portion, a portion of the object image having a difference greater than a threshold value. The threshold value determination means determines the threshold value statistically on the basis of the distribution of the plurality of background images.
According to an additional aspect of the present invention, an object extraction apparatus for extracting an object portion by removing an unwanted portion from an object image, on the basis of the object image obtained by shooting an object of interest and a plurality of background images obtained by shooting only the background of the object of interest a plurality of times, includes computation means, difference means, and extraction means. The computation means computes, for every pixel, the mean value and the standard deviation of the pixels located at the same coordinates in the plurality of background images. The difference means computes the difference between the value of each pixel in the object image and the mean value of the pixels in the background images corresponding to that pixel. The extraction means extracts, as the object portion, each pixel of the object image whose difference is greater than a predetermined multiple of the standard deviation.
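The per-pixel statistical test can be sketched as below. Two points are assumptions made for illustration and are not stated in the text: the standard deviation is the population form (division by the number of background images), and the difference is compared by its absolute value. The function names and the multiplier `k` are hypothetical.

```python
import math


def background_statistics(backgrounds):
    """Per-pixel mean and standard deviation over several background shots."""
    n = len(backgrounds)
    height, width = len(backgrounds[0]), len(backgrounds[0][0])
    mean = [[sum(b[y][x] for b in backgrounds) / n for x in range(width)]
            for y in range(height)]
    # Assumption: population standard deviation (divide by n).
    std = [[math.sqrt(sum((b[y][x] - mean[y][x]) ** 2 for b in backgrounds) / n)
            for x in range(width)]
           for y in range(height)]
    return mean, std


def extract_pixels(obj, mean, std, k):
    """Mark a pixel as object when |pixel - background mean| > k * std."""
    return [[abs(obj[y][x] - mean[y][x]) > k * std[y][x]
             for x in range(len(obj[0]))]
            for y in range(len(obj))]


# Three hypothetical shots of a 1x2 background: the first pixel is noisy,
# the second pixel is perfectly stable.
shots = [[[10, 20]], [[12, 20]], [[14, 20]]]
bg_mean, bg_std = background_statistics(shots)

print(extract_pixels([[13, 20]], bg_mean, bg_std, 3))  # [[False, False]]
print(extract_pixels([[30, 21]], bg_mean, bg_std, 3))  # [[True, True]]
```

The threshold adapts to each pixel: the noisy first pixel tolerates a deviation of 1, while the stable second pixel, whose standard deviation is zero, is flagged by any deviation at all.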
According to yet a further aspect of the present invention, an object extraction apparatus for extracting an object portion by removing an unwanted portion from an object image, on the basis of the object image obtained by shooting an object of interest and a plurality of background images obtained by shooting only the background of the object of interest a plurality of times, includes average/standard deviation computation means, region segmentation means, difference means, average difference computation means, average standard deviation computation means, and extract means. The average/standard deviation computation means computes, for every pixel, the mean value and the standard deviation of the pixels located at the same coordinates in the plurality of background images. The region segmentation means divides the object image into a plurality of regions. The difference means computes the difference between the value of each pixel in each region of the object image and the mean value of the corresponding pixels in the region of the background images corresponding to that region. The average difference computation means computes the mean value of the differences for each region. The average standard deviation computation means computes the mean value of the standard deviation for each region. The extract means extracts, as the object portion, each region out of the plurality of regions whose mean value of the differences is greater than a predetermined multiple of the mean value of the standard deviation.
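A region-level version of the statistical test might look like the following sketch. It assumes a precomputed region label map, the population standard deviation, and the absolute value of each per-pixel difference before averaging (so that differences of opposite sign do not cancel); none of these choices is fixed by the paragraph above, and all names are hypothetical.

```python
import math


def pixel_mean_std(backgrounds):
    """Per-pixel mean and population standard deviation of the background shots."""
    n = len(backgrounds)
    height, width = len(backgrounds[0]), len(backgrounds[0][0])
    mean = [[sum(b[y][x] for b in backgrounds) / n for x in range(width)]
            for y in range(height)]
    std = [[math.sqrt(sum((b[y][x] - mean[y][x]) ** 2 for b in backgrounds) / n)
            for x in range(width)]
           for y in range(height)]
    return mean, std


def extract_regions(obj, backgrounds, regions, k):
    """Keep a region when mean(|difference|) > k * mean(std) within that region."""
    mean, std = pixel_mean_std(backgrounds)
    diff_sum, std_sum, count = {}, {}, {}
    for y, label_row in enumerate(regions):
        for x, label in enumerate(label_row):
            # Assumption: absolute difference, so opposite signs do not cancel.
            diff_sum[label] = diff_sum.get(label, 0.0) + abs(obj[y][x] - mean[y][x])
            std_sum[label] = std_sum.get(label, 0.0) + std[y][x]
            count[label] = count.get(label, 0) + 1
    return {label for label in count
            if diff_sum[label] / count[label] > k * std_sum[label] / count[label]}


# Two hypothetical background shots (mean 11, std 1 at every pixel).
shots = [[[10, 10, 10, 10]], [[12, 12, 12, 12]]]
image = [[11, 12, 30, 11]]
labels = [[0, 0, 1, 1]]

print(extract_regions(image, shots, labels, 3))  # {1}
```

Region 1 is kept even though one of its two pixels equals the background mean, since the region average of the differences (9.5) exceeds three times the region average of the standard deviation (3.0).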
According to still another aspect of the present invention, an object extraction apparatus for extracting an object portion by removing an unwanted portion from an object image, on the basis of the object image obtained by shooting an object of interest and a plurality of background images obtained by shooting only the background of the object of interest a plurality of times, includes average/standard deviation computation means, region segmentation means, average computation means, difference means, average difference computation means, average standard deviation computation means, and extract means. The average/standard deviation computation means computes, for each pixel, the mean value and the standard deviation of the pixels located at the same coordinates in the plurality of background images. The region segmentation means divides the object image into a plurality of regions. The average computation means computes the mean value of the pixels in each region. The difference means computes the absolute value of the difference between the mean value of the pixels in each region of the object image and the mean value of the pixels in the region of the background images corresponding to that region. The average difference computation means computes the mean value of the absolute values of the difference for each region. The average standard deviation computation means computes the mean value of the standard deviation for each region. The extract means extracts, as the object portion, each region out of the plurality of regions whose mean value of the absolute values of the difference is greater than a predetermined multiple of the mean value of the standard deviation.
According to yet another aspect of the present invention, an object extraction apparatus for extracting an object portion by removing an unwanted portion from an object image, on the basis of the object image obtained by shooting an object of interest and a plurality of background images obtained by shooting only the background of the object of interest a plurality of times, includes average/standard deviation computation means, region segmentation means, average computation means, difference means, average standard deviation computation means, and extract means. The average/standard deviation computation means computes, for each pixel, the mean value and the standard deviation of the pixels located at the same coordinates in the plurality of background images. The region segmentation means divides the object image into a plurality of regions. The average computation means computes the mean value of the pixels in each region of the object image, and also the mean value, in each region, of the per-pixel mean values of the background images. The difference means computes the absolute value of the difference between the mean value of the pixels in each region of the object image and the mean value, in the corresponding region, of the per-pixel mean values of the background images. The average standard deviation computation means computes the mean value of the standard deviation for each region. The extract means extracts, as the object portion, each region out of the plurality of regions whose absolute value of the difference is greater than a predetermined multiple of the mean value of the standard deviation.
According to still another aspect of the present invention, an object extraction apparatus for extracting an object portion by removing an unwanted portion from an object image, on the basis of a plurality of object images obtained by shooting an object of interest a plurality of times and a plurality of background images obtained by shooting only the background of the object of interest a plurality of times, includes average/standard deviation computation means, average computation means, region segmentation means, difference means, average difference computation means, average standard deviation computation means, and extract means. The average/standard deviation computation means computes, for each pixel, the mean value and the standard deviation of the pixels located at the same coordinates in the plurality of background images. The average computation means computes, for each pixel, the mean value of the pixels located at the same coordinates in the plurality of object images. The region segmentation means divides the object image into a plurality of regions. The difference means computes the absolute value of the difference between the mean value of each pixel in each region of the object image and the mean value of the corresponding pixel in the region of the background images corresponding to that region. The average difference computation means computes the mean value of the absolute values of the difference for each region. The average standard deviation computation means computes the mean value of the standard deviation for each region. The extract means extracts, as the object portion, each region out of the plurality of regions whose mean value of the absolute values of the difference is greater than a predetermined multiple of the mean value of the standard deviation.
According to the above object extraction apparatus, a portion of the object of interest having the same color as the background, if any, can be detected and extracted as part of the object. The amount of work to be carried out manually can be reduced. Also, a special shooting environment is unnecessary.
According to yet a further aspect of the present invention, a three-dimensional model generation apparatus for generating a three-dimensional model of an object of interest includes: shooting means for shooting the background of an object of interest and shooting the object of interest including the background; silhouette generation means obtaining the difference between a background image obtained by shooting only the background and a plurality of object images obtained by shooting the object of interest with the background for generating a plurality of silhouette images; and means for generating a three-dimensional model of the object of interest using the plurality of silhouette images.
The three-dimensional model generation apparatus preferably includes rotary means for rotating the object of interest.
According to yet an additional aspect of the present invention, a three-dimensional model generation apparatus for generating a three-dimensional model of an object of interest includes: silhouette generation means for generating a plurality of silhouette images of the object of interest; estimation means for estimating the existing region of the object of interest in a voxel space according to the plurality of silhouette images; and means for generating a three-dimensional model of the object of interest using the existing region of the object of interest obtained by the estimation means.
Preferably, the estimation means carries out a voting process on the voxel space.
Preferably, the three-dimensional model generation apparatus further includes threshold value processing means for setting, as the existing region of the object of interest, the portion having a vote score greater than a predetermined threshold value as a result of the voting process.
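The voting process and the subsequent threshold processing can be sketched as follows. For brevity the sketch assumes a tiny cubic voxel grid and orthographic views along the three axes, each given as a silhouette image plus a projection function; the camera model, the data layout, and all names are hypothetical simplifications rather than the arrangement of the apparatus.

```python
def vote_voxels(size, views):
    """Count, for every voxel, the silhouettes that its projection falls inside.

    views: list of (silhouette, project) pairs, where project maps a voxel
    (x, y, z) to a (row, col) position in that silhouette image.
    """
    votes = {}
    for x in range(size):
        for y in range(size):
            for z in range(size):
                score = 0
                for silhouette, project in views:
                    row, col = project(x, y, z)
                    if silhouette[row][col]:
                        score += 1
                votes[(x, y, z)] = score
    return votes


def threshold_votes(votes, threshold):
    """Keep the voxels whose vote score is greater than the threshold."""
    return {voxel for voxel, score in votes.items() if score > threshold}


# A single occupied voxel at (0, 0, 0) seen in a 2x2x2 grid from three sides.
good = [[1, 0], [0, 0]]
bad = [[0, 0], [0, 0]]          # an improper silhouette that missed the object
views = [
    (good, lambda x, y, z: (x, y)),   # view along the z axis
    (good, lambda x, y, z: (x, z)),   # view along the y axis
    (bad, lambda x, y, z: (y, z)),    # view along the x axis
]

votes = vote_voxels(2, views)
print(threshold_votes(votes, 1))  # {(0, 0, 0)}
print(threshold_votes(votes, 2))  # set()
```

With a threshold of 1 the voxel survives on two votes out of three despite the improper third silhouette, whereas requiring all three views to agree (threshold 2, equivalent to a strict intersection) loses it.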
According to the above three-dimensional model generation apparatus, a special shooting environment such as a backboard of the same color is dispensable since a three-dimensional model is generated using a silhouette image obtained by carrying out difference processing.
Since a three-dimensional model is generated by carrying out a voting process on the voxel space on the basis of a plurality of silhouette images, a three-dimensional model can be generated at high accuracy even when some of the plurality of silhouette images are improper.
Since the three-dimensional model is generated by polygonal approximation of the contour lines of a plurality of cut out planes obtained by cutting a three-dimensional shape of an object of interest, the amount of data for three-dimensional model generation can be reduced to allow high speed processing.
Since a three-dimensional model is generated by polygonal approximation of the contour line of a plurality of cross sectional shapes of an object of interest, the amount of data for three-dimensional model generation can be reduced to allow high speed processing.
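The data reduction obtained by polygonal approximation of a contour line can be illustrated with a short sketch. The text above does not name an approximation algorithm; the sketch uses the Ramer-Douglas-Peucker method as a stand-in, applied to an open polyline of (x, y) points. The function names and the `tolerance` parameter are hypothetical.

```python
import math


def point_line_distance(p, a, b):
    """Perpendicular distance from point p to the line through a and b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    length = math.hypot(dx, dy)
    if length == 0:
        return math.hypot(px - ax, py - ay)
    return abs(dx * (ay - py) - dy * (ax - px)) / length


def approximate_polyline(points, tolerance):
    """Ramer-Douglas-Peucker simplification (a stand-in, not the patent's method)."""
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    index, max_dist = 0, 0.0
    for i in range(1, len(points) - 1):
        dist = point_line_distance(points[i], first, last)
        if dist > max_dist:
            index, max_dist = i, dist
    if max_dist <= tolerance:
        return [first, last]
    # Split at the farthest point and simplify both halves recursively.
    left = approximate_polyline(points[:index + 1], tolerance)
    right = approximate_polyline(points[index:], tolerance)
    return left[:-1] + right


# A hypothetical contour fragment: a slightly noisy edge followed by a corner.
contour = [(0, 0), (1, 0.05), (2, 0), (3, 0), (3, 1), (3, 2)]
print(approximate_polyline(contour, 0.1))  # [(0, 0), (3, 0), (3, 2)]
```

Six contour points are reduced to three vertices while the corner at (3, 0) is preserved, which is the kind of reduction in data amount the passage refers to.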