Augmented reality (AR) is generally regarded as the presentation of a (typically live) view of a physical, real-world object or scene, augmented by computer-generated elements such as graphics. A familiar example is the display of a virtual yellow “first down line” in a televised football game. The technology sometimes goes by the name mixed reality.
AR systems commonly involve one or more cameras to capture video imagery depicting the physical world, together with a display that renders the captured imagery, with augmentation, to a user. The display may be head-worn (as in, e.g., the Microsoft HoloLens product, and AR contact lenses), but need not be. In addition to the just-noted television example, a smartphone display can be used to provide an AR experience.
In many implementations, an AR augmentation, such as an icon or other graphic indicia, is anchored relative to a particular point within the captured scene, and moves as the depiction of this point moves between frames of the captured imagery. In many systems, the particular point is a distinctive feature depicted in the captured imagery. Such a system must thus first locate the feature within the imagery, and then track this feature as its depiction moves between video frames, so the associated augmentation can spatially follow on the display screen.
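The locate-then-track behavior just described can be illustrated with a minimal sketch. It assumes that some detector (not shown) supplies candidate feature locations for each frame, and it uses simple nearest-neighbor matching as a stand-in for whatever tracking method a given system actually employs. The function names and the fixed pixel offset are hypothetical.

```python
from math import dist
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def track_feature(previous_xy: Point, candidates: Sequence[Point]) -> Point:
    """Pick the candidate closest to the feature's last known position.

    Nearest-neighbor matching is an illustrative assumption only.
    """
    return min(candidates, key=lambda c: dist(c, previous_xy))


def follow(initial_xy: Point, offset: Point,
           frames: Sequence[Sequence[Point]]) -> List[Point]:
    """Return the augmentation position for each frame.

    The augmentation is anchored at a fixed pixel offset from the feature.
    If a frame yields no candidates, the last known position is reused.
    """
    position = initial_xy
    overlay_positions = []
    for candidates in frames:
        if candidates:
            position = track_feature(position, candidates)
        overlay_positions.append((position[0] + offset[0],
                                  position[1] + offset[1]))
    return overlay_positions


if __name__ == "__main__":
    frames = [[(11, 10), (50, 50)], [(13, 11), (49, 50)], []]
    print(follow((10, 10), (0, -5), frames))
```

In this sketch the augmentation simply translates with the feature; scale and orientation are not handled.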
In some arrangements, the distinctive feature takes the form of an overt symbol or marker that has been added to the scene (or object) in order to enable augmentations. An early example was the ARToolKit marker, a square black and white pattern akin to a QR code. Barcodes themselves can similarly be introduced into a scene to serve as AR markers. Such markers are sometimes termed “fiducials,” and commonly enable the viewing system to discern a relative orientation and distance to the marker.
Another form of marker—especially useful with printed objects—is a steganographic pattern. Such a pattern is not evident to human viewers, but can be discerned and localized by a compliant detector. Such technology is commonly known as digital watermarking, and is detailed in exemplary references, below.
In more recent arrangements, augmentations need not be anchored relative to a marker, per se. Instead, the marker can encode an identifier that enables access to a set of distinctive scene feature points. Augmentations can then be anchored relative to these feature points, which naturally occur within the scene.
One such arrangement is offered by Zappar, Ltd., under the name Zapcodes. In that system, an overt machine-readable indicia is included in known imagery, such as cereal box artwork, or a web page. This indicia encodes a plural-bit identifier that is associated—in a remote database—with (1) information about an overlay graphic to be presented to users; and (2) feature point information for the imagery (e.g., cereal box artwork) in which that indicia is found. (This reference imagery may be termed a “tracking image.”)
When a user's smartphone captures imagery of the overt machine-readable indicia, a local app decodes the plural-bit identifier, and sends it to the database. The database responds by sending information about the overlay graphic to the phone, together with the stored feature point (a.k.a. keypoint, or salient point) information for the tracking image referenced by the overt indicia. As the user moves the smartphone relative to the tracking image, these feature points allow the phone to discern its pose relative to the tracking image. The app then adapts the position, scale and orientation of the overlay graphic in accordance with the discerned phone pose, and renders it atop the imagery being captured by the phone camera.
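As a simplified illustration of how matched feature points can drive the position, scale and orientation of an overlay, the following sketch fits a 2D similarity transform (scale, rotation, translation) between reference feature points and their observed locations in a captured frame. This is an assumption made for brevity: a full phone pose is three-dimensional and would ordinarily call for a homography or perspective-n-point solution. The sketch also assumes the point correspondences are already known. All names are hypothetical and are not drawn from the Zappar system.

```python
import cmath
from typing import Sequence, Tuple

Point = Tuple[float, float]


def fit_similarity(reference_points: Sequence[Point],
                   observed_points: Sequence[Point]) -> Tuple[complex, complex]:
    """Least-squares fit of w = a*z + b, with points treated as complex numbers.

    abs(a) is the scale, the phase of a is the rotation, b is the translation.
    """
    if len(reference_points) != len(observed_points) or len(reference_points) < 2:
        raise ValueError("need at least two matched point pairs")
    z = [complex(x, y) for x, y in reference_points]
    w = [complex(x, y) for x, y in observed_points]
    z_mean = sum(z) / len(z)
    w_mean = sum(w) / len(w)
    denom = sum(abs(zi - z_mean) ** 2 for zi in z)
    if denom == 0:
        raise ValueError("reference points must not all coincide")
    a = sum((wi - w_mean) * (zi - z_mean).conjugate()
            for zi, wi in zip(z, w)) / denom
    b = w_mean - a * z_mean
    return a, b


def place_overlay(a: complex, b: complex,
                  anchor: Point) -> Tuple[Point, float, float]:
    """Map an overlay anchor given in tracking-image coordinates into the frame.

    Returns (position, scale, rotation in radians).
    """
    p = a * complex(*anchor) + b
    return (p.real, p.imag), abs(a), cmath.phase(a)


if __name__ == "__main__":
    reference = [(0, 0), (1, 0), (0, 1)]
    observed = [(5, 5), (5, 7), (3, 5)]  # rotated 90 degrees, scaled 2x, shifted
    a, b = fit_similarity(reference, observed)
    print(place_overlay(a, b, (1, 1)))
```

The complex-number formulation keeps the fit to a few lines; it is one convenient choice among several.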
Adjustments to the size of the rendered overlay depend on the spacings of the detected feature points. If the points begin moving farther apart, this indicates the camera is moving towards the tracking image, so the overlay graphic is presented at progressively increasing pixel-size. Conversely, if the feature points are moving closer together, this indicates the camera is moving away from the tracking image, so the overlay is presented at progressively decreasing size.
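The relationship between feature point spacing and overlay size can be sketched as follows. The sketch assumes that the overlay scale is taken as the ratio of the mean pairwise distance among the observed points to the mean pairwise distance among the same points in the reference tracking image. This particular measure is an illustrative choice, and the function names are hypothetical.

```python
from itertools import combinations
from math import dist
from typing import Sequence, Tuple

Point = Tuple[float, float]


def mean_spacing(points: Sequence[Point]) -> float:
    """Average distance over all pairs of points."""
    pairs = list(combinations(points, 2))
    if not pairs:
        raise ValueError("need at least two points")
    return sum(dist(p, q) for p, q in pairs) / len(pairs)


def overlay_scale(reference_points: Sequence[Point],
                  observed_points: Sequence[Point]) -> float:
    """Scale factor for the overlay relative to its reference size.

    Greater than 1 when the points have spread apart (camera moved closer),
    less than 1 when they have drawn together (camera moved away).
    """
    return mean_spacing(observed_points) / mean_spacing(reference_points)


if __name__ == "__main__":
    reference = [(0, 0), (10, 0), (0, 10)]
    closer = [(0, 0), (20, 0), (0, 20)]
    print(overlay_scale(reference, closer))
```

Multiplying the overlay's reference pixel dimensions by this factor yields the progressively larger or smaller rendering described above.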
It will be recognized that the just-reviewed arrangement cannot discern any pose information for the phone (relative to the tracking image) until the database has responded with feature point data for that tracking image. If communication with the database is lost, and no feature points can be downloaded, no augmentation can happen (unless the phone has pre-loaded an entire catalogue of tracking images).
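This dependency on the database can be made concrete with a short sketch. It assumes a hypothetical record type holding the overlay information and feature points, a hypothetical remote fetch function that raises ConnectionError when the database is unreachable, and an optional pre-loaded local catalogue. None of these names reflect an actual product interface.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


@dataclass(frozen=True)
class TrackingRecord:
    """Hypothetical database entry keyed by the decoded identifier."""
    overlay_name: str
    feature_points: List[Tuple[float, float]]


def resolve_record(identifier: int,
                   fetch_remote: Callable[[int], TrackingRecord],
                   preloaded: Dict[int, TrackingRecord]) -> Optional[TrackingRecord]:
    """Return the record for an identifier, or None if none is available.

    Without a record there are no feature points, hence no pose estimate
    and no augmentation.
    """
    try:
        return fetch_remote(identifier)
    except ConnectionError:
        # Database unreachable: fall back to any pre-loaded catalogue.
        return preloaded.get(identifier)


if __name__ == "__main__":
    def offline(_identifier: int) -> TrackingRecord:
        raise ConnectionError("database unreachable")

    print(resolve_record(42, offline, {}))  # no record, so no augmentation
```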
Moreover, the just-detailed arrangement requires the tracking image to be known in advance (and pre-processed to identify the feature points) before it can serve as the basis for an AR experience.
Certain embodiments of the present technology redress one or more of these shortcomings, and provide other features in certain instances.