Devices such as digital cameras, phones with embedded cameras, or other camera or sensor devices may be used to identify and track objects in three-dimensional environments. This capability may be used to create augmented reality displays in which information about objects recognized by a system is presented to a user observing a display of the system. Such information may be presented as an overlay on the real environment in a device's display. Information from a database of objects may then be used to identify objects in the environment observed by a device.
Mobile devices with embedded digital cameras, in particular, may have limited storage and processing resources, especially in comparison to powerful fixed-installation server systems. One way of reducing the processing and bandwidth load of a system implementing such object detection and tracking is to store a local database of object information that may be used to identify objects in the environment. This database information may essentially be considered assistance information that helps a device identify objects using templates stored in the database. When a device is operating in an augmented reality or object identification mode, images captured by the device are compared with object representations in the database to determine whether there is an object match and, if so, what the current pose of the camera is relative to the identified object. When an object match occurs, a responsive action may be initiated, or additional information related to the object may be presented in a device display in conjunction with the image containing the identified object.
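The comparison of captured image data against a local database of object templates can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from any described system: each template is assumed to be a list of numeric feature descriptors, matching is assumed to use a nearest-neighbor search with a ratio test, and the names `count_matches`, `identify`, `ratio`, and `min_matches` are hypothetical. Camera pose estimation is deliberately left out.

```python
import math


def distance(a, b):
    """Euclidean distance between two descriptors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def count_matches(query, template, ratio=0.8):
    """Count query descriptors with an unambiguous nearest neighbor in the template.

    A query descriptor counts as a match when its closest template descriptor
    is clearly closer than the second closest (ratio test). The ratio value
    is an illustrative assumption.
    """
    matches = 0
    for q in query:
        dists = sorted(distance(q, t) for t in template)
        if len(dists) == 1 or (len(dists) >= 2 and dists[0] < ratio * dists[1]):
            matches += 1
    return matches


def identify(query, database, min_matches=3):
    """Return the name of the best-matching object, or None if no object matches.

    `database` maps object names to lists of descriptors (the local templates).
    `min_matches` is a hypothetical acceptance threshold.
    """
    best_name, best_count = None, 0
    for name, template in database.items():
        n = count_matches(query, template, )
        if n > best_count:
            best_name, best_count = name, n
    return best_name if best_count >= min_matches else None


# Toy local database with two hypothetical objects and 2-D descriptors.
database = {
    "mug": [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0)],
    "book": [(50.0, 50.0), (60.0, 50.0), (50.0, 60.0), (60.0, 60.0)],
}

# Descriptors extracted from a captured image (made-up values).
query = [(0.1, 0.2), (9.8, 0.1), (0.2, 9.9)]
print(identify(query, database))
```

In a real system the descriptors would be produced by a feature extractor, and a successful match would typically be followed by estimating the camera pose from the matched features, which is the step that triggers the responsive action or overlay.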
While systems exist for creating such database information, the existing systems do not scale to a broad variety of mobile devices. One such existing system uses combined geometric/texture models of the object of interest. These models are sometimes known at the object production stage (for example, as CAD models), but in most cases they are unavailable. Another known method is to use a laser-based or IR-based scanning system to simultaneously estimate the geometry and collect images of an object. However, such scanning systems are typically expensive, and yet remain texture-challenged because of physical limitations of the different sensors used. Thus, in general, the models are either unavailable or inaccurate enough to degrade detection performance.
Systems and methods for creating three-dimensional object representations for use in computer vision as described herein may provide improvements and simplification in the way object representations are currently obtained for use in detection and tracking systems.