1. Technical Field
The present invention relates to a method of image processing in an augmented reality application, comprising the steps of providing at least one image of a real environment and performing image processing in an augmented reality application with the at least one image employing visualization of overlaying digital information with visual impressions or the image of the real environment and employing vision-based processing or tracking. The invention also relates to a computer program product comprising software code sections for performing the method.
2. Background Information
Augmented reality systems and applications present enhanced information of the real environment by providing a visualization of overlaying digital information, particularly computer-generated virtual information, with visual impressions or an image of the real environment. The digital information can be any type of visually perceivable data such as objects, texts, drawings, videos, or any combination thereof. The real environment is captured, for example, with a camera held by a user or attached to a device held by a user. The digital information has to be superimposed with the real environment or a part of the real environment in the camera image at the right time, at the right place and in the right way in order to offer a satisfying visual perception to users.
The right time means that the digital information should be superimposed with the real environment in the image when and only when it is required or necessary, for example, when a particular real object appears in the camera image (e.g., See Kato, H. and M. Billinghurst. Marker Tracking and HMD Calibration for a Video-based Augmented Reality Conferencing System. In 2nd IEEE and ACM International Workshop on Augmented Reality (IWAR 99). 1999. San Francisco, Calif.; referred to hereinafter as “Kato”), or when the camera is positioned at a particular geometrical location (e.g., See S. Feiner et al., “A Touring Machine: Prototyping 3D Mobile Augmented Reality Systems for Exploring the Urban Environment,” Proc. 1st Int'l Symp. Wearable Computers (ISWC'97), IEEE CS Press, Los Alamitos, Calif., 1997, pp. 74-81).
The digital information should be superimposed with the real environment at desired pixel positions within the image, for example in a perspectively correct way, i.e. adapted to and derived from the real environment being viewed. In order to achieve this, the pose of the camera, i.e. its orientation and position, with respect to the real environment or a part of it has to be known (e.g., See Kato). Vision is an indispensable component for computing the camera pose in augmented reality applications, as the camera image that captures the real environment can always serve as a means for camera pose estimation and/or real object detection. Various vision-based online tracking solutions have been developed to compute the pose of the camera for augmented reality applications (e.g., See Sanni Siltanen, Theory and applications of marker-based augmented reality. Espoo 2012. VTT Science 3; which may also be found at http://www.vtt.fi/inf/pdf/science/2012/S3.pdf; hereinafter referred to as “Siltanen”).
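By way of illustration only, the dependence of the overlay pixel position on the camera pose may be sketched with a simple pinhole camera model. The function names, the intrinsic parameters (fx, fy, cx, cy) and all numeric values below are assumptions chosen for the example and are not taken from any of the cited references.

```python
# Pinhole projection sketch; all names and numbers are illustrative assumptions.

def mat_vec(m, v):
    """Multiply a 3x3 matrix (list of rows) by a 3-vector."""
    return [sum(m[r][c] * v[c] for c in range(3)) for r in range(3)]

def project_point(point_world, rotation, translation, fx, fy, cx, cy):
    """Map a 3D point in world coordinates to pixel coordinates.

    rotation (3x3) and translation (3-vector) form the camera pose that takes
    world coordinates to camera coordinates; fx, fy, cx, cy are the intrinsics.
    """
    rotated = mat_vec(rotation, point_world)
    x, y, z = (rotated[i] + translation[i] for i in range(3))
    if z <= 0:
        raise ValueError("point is behind the camera")
    return (fx * x / z + cx, fy * y / z + cy)

identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
# The world origin lies 2 m in front of the camera along its optical axis.
pixel = project_point([0.1, -0.05, 0.0], identity, [0.0, 0.0, 2.0],
                      fx=800.0, fy=800.0, cx=320.0, cy=240.0)
print(pixel)
```

In such a sketch, any error in the estimated rotation or translation shifts the computed pixel position, which is why the overlay can only be placed correctly when the camera pose is known.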
The right way means that the digital information should be embedded into the real environment in the image or view of the real environment (for example, viewed by an optical see-through display device) depending on the purpose of an application. For augmented reality applications where virtual information is used to give instructions, draw users' attention, or support the understanding of 3D shapes and dimensions, the virtual information is preferably superimposed with the real environment such that it is bright and distinguishable from the real environment. Examples of such applications are augmented assembly and maintenance support; e.g., See Azpiazu, J., Siltanen, S., Multanen, P., Makiranta, A., Barrena, N., Diez, A., Agirre, J. & Smith, T. Remote support for maintenance tasks by the use of Augmented Reality: the ManuVAR. This visualization is so-called non-photorealistic visualization. In contrast, photorealistic visualization, in which virtual information is visually indistinguishable from the real environment, is preferred in many applications. Virtual interior design (e.g., See Sanni Siltanen and Charles Woodward. 2006. Augmented interiors with digital camera images. In Proceedings of the 7th Australasian User interface conference - Volume 50 (AUIC '06), Wayne Piekarski (Ed.), Vol. 50. Australian Computer Society, Inc., Darlinghurst, Australia, 33-36; hereinafter referred to as “Siltanen and Woodward”) and augmented clothing are examples of such applications. For this, many solutions have been developed to enhance visual realism by incorporating occlusion (e.g., See Ivan J. Jaszlics, Sheila L. Jaszlics, U.S. Pat. No. 6,166,744, System for combining virtual images with real-world scenes; hereinafter referred to as “Jaszlics”), illumination (e.g., See Siltanen and Woodward), and noise and motion blur in camera images (e.g., See Jan Fischer, Dirk Bartz, and Wolfgang Strasser. Enhanced visual realism by incorporating camera image effects. In Proceedings of the 5th IEEE and ACM International Symposium on Mixed and Augmented Reality (ISMAR '06). 2006; hereinafter referred to as “Fischer”) into the visualization. A recent survey of augmented reality technologies, including tracking, visualization, and user interfaces, is presented in Siltanen.
Many visualization techniques for augmented reality applications have been developed to improve visualization of overlaying digital information with visual impressions or an image of the real environment, particularly the visual perception of the overlay of digital information and real environment. In order to achieve a photorealistic visualization, Siltanen and Woodward erase the artificial markers that are used for camera pose estimation from images and introduce lighting and shadow effects to virtual objects. Fischer incorporates camera image effects, e.g. noise and motion blur, into virtual objects. Jaszlics detects the range data of the real environment to generate a correct visual effect of the virtual information being occluded by a part of the real environment in the overlay image of the virtual information and the real environment.
The performance of vision-based tracking solutions is often quantified in terms of the re-projection error or the result of a similarity measure. The re-projection error corresponds to the pixel distance between the projection of a real 3D point into an image and the measured position of that point in the image. The similarity measure quantifies how closely a transformed reference visual feature matches a visual feature in a camera image. Common examples of image similarity measures include the sum-of-squared differences (SSD), cross-correlation, and mutual information. The result of a similarity measure is a real number. If the similarity measure is the sum-of-squared differences between two image patches, the smaller the similarity measure result is, the more similar the two visual features are. If the similarity measure is the zero-normalized cross-correlation between two image patches, the bigger the similarity measure result is, the more similar the two visual features are. The proposed method according to the invention, as set out below, can use any of these techniques.
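For illustration, the re-projection error and two of the similarity measures named above may be computed as in the following minimal sketch. The patches are given as flattened lists of hypothetical grayscale intensities, and the function names are chosen for this example only.

```python
import math

def reprojection_error(projected, measured):
    """Pixel distance between a projected 3D point and its measured image position."""
    return math.hypot(projected[0] - measured[0], projected[1] - measured[1])

def ssd(patch_a, patch_b):
    """Sum of squared differences; a smaller result means more similar patches."""
    return sum((a - b) ** 2 for a, b in zip(patch_a, patch_b))

def zncc(patch_a, patch_b):
    """Zero-normalized cross-correlation in [-1, 1]; a larger result means more similar."""
    n = len(patch_a)
    mean_a = sum(patch_a) / n
    mean_b = sum(patch_b) / n
    da = [a - mean_a for a in patch_a]
    db = [b - mean_b for b in patch_b]
    denom = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    if denom == 0:
        raise ValueError("ZNCC is undefined for a constant patch")
    return sum(x * y for x, y in zip(da, db)) / denom

# Flattened 2x2 grayscale patches (hypothetical intensity values).
reference = [10, 20, 30, 40]
brighter = [v + 50 for v in reference]  # same pattern, uniform brightness offset
shuffled = [40, 10, 30, 20]             # same intensities, different pattern

print(reprojection_error((360.0, 220.0), (363.0, 224.0)))
print(ssd(reference, brighter), zncc(reference, brighter))
print(ssd(reference, shuffled), zncc(reference, shuffled))
```

In this example the two measures rank the candidate patches differently: the SSD is smaller for the shuffled patch, whereas the zero-normalized cross-correlation is larger for the uniformly brighter patch, because the normalization removes the brightness offset.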
The parameters (including their values) and the operating flow or workflow of vision-based processing or tracking solutions are typically configured such that the similarity measure and/or the re-projection error are minimized; e.g., See Wang, L.; Springer, M.; Heibel, T. H. & Navab, N. (2010), Floyd-Warshall all-pair shortest path for accurate multi-marker calibration, in ‘ISMAR’, IEEE, pp. 277-278. However, in augmented reality applications, the challenge is that neither a minimized similarity measure nor a minimized re-projection error can indicate or guarantee a satisfying overlay of digital information and real environment in a camera image or in the view of an optical see-through display device. Therefore, even when vision-based processing or tracking solutions are optimized in terms of the similarity measure and/or the re-projection error, unsatisfying visualization often occurs, such as jittering, virtual information misaligned with the real environment, or discontinuous display of virtual information.
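The following sketch illustrates, under simplifying assumptions, why a small per-frame error does not by itself rule out jittering. Two hypothetical trackers place an overlay anchor within one pixel of its true position in every frame of a static scene, yet only one of them yields a visually steady overlay. The data and function names are invented for this example.

```python
import math

def per_frame_errors(rendered, reference):
    """Pixel error of the overlay anchor with respect to its true position, per frame."""
    return [math.dist(r, m) for r, m in zip(rendered, reference)]

def frame_to_frame_motion(positions):
    """Pixel displacement of the overlay anchor between consecutive frames."""
    return [math.dist(a, b) for a, b in zip(positions, positions[1:])]

# Static scene: the true anchor stays at the same pixel in every frame.
truth = [(100.0, 100.0)] * 6
# Two hypothetical trackers, both always within 1 px of the truth.
steady = [(101.0, 100.0)] * 6
jittery = [(101.0, 100.0) if i % 2 == 0 else (99.0, 100.0) for i in range(6)]

for name, track in (("steady", steady), ("jittery", jittery)):
    print(name,
          max(per_frame_errors(track, truth)),
          max(frame_to_frame_motion(track)))
```

Both tracks have the same maximum per-frame error of one pixel, but the second one moves the overlay by two pixels between consecutive frames, which a user would perceive as jitter.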
Augmented reality authoring tools allow people who have, e.g., no background in software development, image processing, or computer vision to create powerful and flexible augmented reality applications in a simple and intuitive way. The authoring tools decouple the usage of the visualization and augmented reality application from the vision-based tracking, detection, or localization solutions. The users of the authoring tools have knowledge of the usage of the visualization and of the properties of the digital information to be overlaid or of the real environment. Nevertheless, only the reference object used for localizing the imaging sensor (camera) in the environment is considered when selecting the type and the parameters of the vision-based tracking algorithms. As discussed above, the augmented reality visualization can therefore be disturbed even when the tracking solutions are optimized for the similarity measure and/or the re-projection error.
It would therefore be beneficial to provide a method of image processing in an augmented reality application which is capable of improving the performance and usability of augmented reality applications, particularly augmented reality authoring tools.