1. Field of the Invention
The present invention relates to an apparatus and method for augmenting a video image.
2. Description of the Prior Art
Typical augmented reality (or ‘AR’) applications or systems receive live or recorded video images of a real environment, and then augment these video images with computer graphics in such a way that the computer graphics appear to move or be positioned in a manner consistent with the movement or position of the real environment in the video images.
The effect is to insert the computer graphics (or ‘virtual objects’) into the real environment in a consistent and believable way. Preferably this is done in real time; that is to say, the generation and augmentation of the video is performed at normal video frame rates. A good example of this can be seen in the game ‘Eye Pet’® for the Sony® PlayStation 3® or ‘PS3’®.
Most AR applications or systems achieve this by making use of a real object in the environment whose appearance and dimensions are known, and then in advance encoding the appearance and dimensions of this object as a reference model in a computer. By comparing the scale and orientation of this object as found in the video images with the reference model, it is possible for the computer to calculate the corresponding scale and orientation that should be applied to virtual objects used to augment the image.
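By way of illustration only, the comparison between the observed object and its reference model can be sketched in a simplified two-dimensional form. The sketch below assumes that one edge of the object has already been located in the video image, and recovers only an in-plane scale factor and rotation angle; a practical system would typically estimate a full three-dimensional pose. The function name `estimate_scale_and_rotation` and the representation of an edge as a pair of points are hypothetical and are not taken from any particular AR system.

```python
import math


def estimate_scale_and_rotation(ref_edge, observed_edge):
    """Compare one edge of the reference model with the same edge in the image.

    Each edge is ((x0, y0), (x1, y1)). Returns (scale, angle_in_radians),
    i.e. the in-plane scale and rotation to apply to a virtual object.
    """
    (rx0, ry0), (rx1, ry1) = ref_edge
    (ox0, oy0), (ox1, oy1) = observed_edge

    # Edge vectors in the reference model and in the video image.
    rdx, rdy = rx1 - rx0, ry1 - ry0
    odx, ody = ox1 - ox0, oy1 - oy0

    ref_len = math.hypot(rdx, rdy)
    if ref_len == 0:
        raise ValueError("reference edge must have non-zero length")

    # Scale is the ratio of observed length to reference length.
    scale = math.hypot(odx, ody) / ref_len

    # Rotation is the difference in edge direction, wrapped to [-pi, pi].
    angle = math.atan2(ody, odx) - math.atan2(rdy, rdx)
    angle = math.atan2(math.sin(angle), math.cos(angle))
    return scale, angle


# Hypothetical usage: a unit-length reference edge along the x axis is
# observed in the image as a vertical edge two pixels long.
scale, angle = estimate_scale_and_rotation(((0, 0), (1, 0)), ((10, 10), (10, 12)))
```

The same scale and angle would then be applied to the virtual object so that it appears consistent with the real object in the image.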
To improve the reliability of this process in adverse lighting conditions, or where the video camera in use has a low resolution, often the real object used is a so-called fiduciary marker 800, an example of which is shown in FIG. 1. Such markers typically have a high contrast border and patterning to improve robustness to lighting, and the pattern is typically asymmetric to help resolve the orientation of the marker.
Subsequently the augmentation of the video image by the AR application often positions computer graphics over the fiduciary marker so that a suspension of disbelief is achieved in the user; for example if the marker is placed on a flat surface, then the whole of the flat surface may be overlaid with a graphical effect (such as a racetrack, or a field or similar). In such cases, if the user wishes to interact with a real or virtual object on top of that flat surface, then the AR application may be able to identify the skin tone of the user and omit (or mask off) the overlaid computer graphics where they coincide with the user's skin in the video image, thereby making the user's skin (e.g. the user's hand) appear to be in front of the graphical layer.
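By way of illustration only, masking of this kind can be sketched as a per-pixel test followed by a composite. The sketch below assumes images held as rows of (R, G, B) tuples with 8-bit values, and uses a simple fixed RGB threshold rule of a kind commonly used as a skin-tone heuristic; the rule, the thresholds and the function names `is_skin` and `composite` are assumptions made for this example and do not describe any particular AR application.

```python
def is_skin(rgb):
    """Crude fixed-threshold skin-tone test on one 8-bit (R, G, B) pixel.

    The thresholds are an assumed heuristic, chosen for illustration.
    """
    r, g, b = rgb
    return (
        r > 95 and g > 40 and b > 20
        and max(r, g, b) - min(r, g, b) > 15
        and abs(r - g) > 15
        and r > g and r > b
    )


def composite(video, overlay):
    """Overlay graphics on the video image, masked off where skin is detected.

    video:   rows of (R, G, B) pixels from the camera.
    overlay: rows of (R, G, B) pixels, or None where no graphic is drawn.
    """
    out = []
    for video_row, overlay_row in zip(video, overlay):
        out.append([
            # Keep the camera pixel where there is no graphic or where the
            # pixel looks like skin; otherwise show the graphic.
            v if (o is None or is_skin(v)) else o
            for v, o in zip(video_row, overlay_row)
        ])
    return out


# Hypothetical usage: one skin-toned pixel and one dark pixel, both covered
# by a green graphic. Only the dark pixel is replaced by the graphic.
frame = [[(200, 150, 120), (30, 30, 30)]]
graphic = [[(0, 255, 0), (0, 255, 0)]]
result = composite(frame, graphic)
```

Because each pixel is classified independently against fixed thresholds, a test of this kind is sensitive to lighting and camera colour response.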
However, often the accuracy of identification of the skin tone can be relatively poor, resulting in noisy or patchy masking of the user's hand over the computer graphics.
The present invention seeks to mitigate or alleviate the above problem.