One example of an apparatus capable of estimating the position or posture of an object is a position/posture recognition apparatus. FIG. 14 is a block diagram showing the arrangement of a conventional position/posture recognition apparatus. This position/posture recognition apparatus includes posture candidate group determination means 910, comparison image generation means 920, posture selection means 930, and end determination means 940.
The operation of the position/posture recognition apparatus shown in FIG. 14 will be described. Input image data 91 containing the image of an object (to be referred to as a target object hereinafter) as a position/posture estimation target is input to the position/posture recognition apparatus. Rough object position/posture parameters containing known errors are also input to the position/posture recognition apparatus as a position/posture initial value 92. The posture candidate group determination means 910 determines a plurality of position/posture estimation value groups by changing each of the six position/posture parameters contained in the position/posture initial value 92 (position parameters along the X-, Y-, and Z-axes and angle parameters about the X-, Y-, and Z-axes) by a predetermined amount of variation.
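The candidate determination described above can be sketched as follows. This is only an illustrative sketch: the function name `candidate_poses`, the tuple layout, and the choice to shift one parameter at a time by plus or minus its step are assumptions made here, since the text says only that the six parameters are changed by a predetermined variation.

```python
from typing import List, Sequence, Tuple

# Hypothetical layout: (x, y, z, angle about X, angle about Y, angle about Z).
Pose = Tuple[float, ...]


def candidate_poses(initial: Sequence[float], step: Sequence[float]) -> List[Pose]:
    """Return the initial pose plus every pose obtained by shifting
    exactly one parameter by +step[i] or -step[i] (an assumed scheme)."""
    if len(initial) != len(step):
        raise ValueError("initial and step must have the same length")
    candidates: List[Pose] = [tuple(initial)]
    for i, delta in enumerate(step):
        for sign in (1.0, -1.0):
            pose = list(initial)
            pose[i] += sign * delta
            candidates.append(tuple(pose))
    return candidates
```

With six parameters this scheme yields 13 candidates per iteration (the initial value and 12 single-parameter variations). A full grid over all combinations would be another possible reading of the text, at a much higher cost per iteration.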
On the basis of the 3D shape model data of the target object and a base texture group used to generate an illumination variation space, both of which are stored in advance in a storage unit (not shown) of the position/posture recognition apparatus, the comparison image generation means 920 generates illumination variation space data which represents the image variation caused by a change in illumination condition when the target object has the position/posture corresponding to each position/posture estimation value group. The comparison image generation means 920 then generates, on the basis of the illumination variation space data, a comparison image group under the same illumination condition as that for the input image data 91.
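The text does not state how a comparison image is matched to the illumination of the input image. One common approach, assumed here purely for illustration, is to treat the illumination variation space as the linear span of a set of base images for the given position/posture and to take the least-squares projection of the input image onto that span. The names `orthonormalize` and `comparison_image` are hypothetical, and images are represented as flat lists of pixel values.

```python
from typing import List, Sequence

Vector = List[float]


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def orthonormalize(basis: Sequence[Sequence[float]], eps: float = 1e-12) -> List[Vector]:
    """Modified Gram-Schmidt: return an orthonormal basis of the same span,
    dropping vectors that are (numerically) linearly dependent."""
    ortho: List[Vector] = []
    for b in basis:
        v = [float(x) for x in b]
        for q in ortho:
            c = dot(v, q)
            v = [vi - c * qi for vi, qi in zip(v, q)]
        norm = dot(v, v) ** 0.5
        if norm > eps:
            ortho.append([vi / norm for vi in v])
    return ortho


def comparison_image(input_image: Sequence[float],
                     base_images: Sequence[Sequence[float]]) -> Vector:
    """Least-squares projection of the input image onto the span of the
    base images, i.e. the image in that span closest to the input."""
    result = [0.0] * len(input_image)
    for q in orthonormalize(base_images):
        c = dot(input_image, q)
        result = [r + c * qi for r, qi in zip(result, q)]
    return result
```

In this sketch one comparison image would be produced per position/posture candidate, each from the base images rendered for that candidate.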
The posture selection means 930 compares the comparison image group with the input image data 91 and outputs, as an optimum position/posture estimation value 93, the position/posture estimation value corresponding to the comparison image with the highest similarity. If there is still room for improvement in the similarity of the comparison image, the end determination means 940 substitutes the optimum position/posture estimation value 93 for the position/posture initial value 92 (or the current position/posture estimation value) and outputs the value to the posture candidate group determination means 910. The position/posture recognition apparatus repeatedly executes the above-described processing until the similarity of the comparison image can no longer be improved, thereby finally obtaining the optimum position/posture of the target object (e.g., Japanese Patent Laid-Open No. 2003-58896 (reference 1)).
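The repeated selection and end determination described above amounts to a greedy local search. The following is a minimal sketch under assumed interfaces: `propose` stands in for the candidate determination, `similarity` stands in for comparison image generation plus comparison with the input image, and `max_iterations` is a safety bound added here. None of these names come from the reference.

```python
from typing import Callable, Iterable, Iterator, Sequence, Tuple

Pose = Tuple[float, ...]


def neighbours(pose: Pose, step: float = 1.0) -> Iterator[Pose]:
    """Hypothetical candidate generator: shift one parameter by +/- step."""
    for i in range(len(pose)):
        for sign in (1.0, -1.0):
            moved = list(pose)
            moved[i] += sign * step
            yield tuple(moved)


def refine_pose(initial: Sequence[float],
                propose: Callable[[Pose], Iterable[Pose]],
                similarity: Callable[[Pose], float],
                max_iterations: int = 1000) -> Pose:
    """Repeat: score the candidates, keep the best one, and stop as soon
    as no candidate improves on the current similarity."""
    current: Pose = tuple(initial)
    current_score = similarity(current)
    for _ in range(max_iterations):
        # propose() must yield at least one candidate.
        scored = [(similarity(p), p) for p in propose(current)]
        best_score, best = max(scored, key=lambda sp: sp[0])
        if best_score <= current_score:
            break  # no room for improvement: end determination
        current, current_score = best, best_score
    return current
```

Because each iteration only accepts a strictly better candidate, a search of this kind stops at a local optimum of the similarity, so the quality of the result depends on the initial value and on the size of the variation.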