Digital watermark technology is known, e.g., from Digimarc's U.S. Pat. Nos. 6,408,082, 6,590,996 and 7,046,819, and publications 20060013395 and 20110274310.
As is familiar to artisans, and as detailed in the cited patents, a digital watermark steganographically conveys a payload of hidden auxiliary data, e.g., in imagery. It also often includes a watermark calibration signal. This calibration signal (which can comprise a known reference signal in a transform domain, such as a pattern of plural impulses in the spatial frequency domain) enables a watermark detector to discern how an image submitted for decoding has been geometrically transformed since it was originally encoded. For example, the calibration signal (which may be called an orientation signal or reference signal) allows the detector to discern an amount by which the image has been shifted in X- and Y-directions (translation), an amount by which it has been changed in scale, and an amount by which it has been rotated. Other transform parameters (e.g., relating to perspective or shear) may also be determined. With knowledge of such “pose” information (geometric state information), the watermark detector can compensate for the geometrical distortion of the image since its original watermarking, and can correctly extract the payload of hidden auxiliary data (watermark message).
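By way of illustration only, the following minimal sketch shows how a reference signal made of a few spatial-frequency impulses can be found in the Fourier magnitude of an image, and how those peaks are unaffected by translation. The block size, the impulse locations, the noise level and the function names are all assumptions made for this example; they are not taken from the cited patents. Random noise stands in for host imagery, and a circular shift stands in for translation. Rotation and scaling, which are not modeled here, would move the peaks in a predictable way rather than leave them fixed.

```python
import numpy as np

N = 128  # block size in pixels (assumed for illustration)
# Hypothetical impulse locations (u, v), in cycles per block.
IMPULSES = [(9, 14), (21, 5), (30, 27), (12, 33)]


def make_calibration_signal(n=N, impulses=IMPULSES, seed=0):
    """Sum one cosine per impulse; each cosine corresponds to a conjugate
    pair of impulses in the 2-D spatial frequency domain."""
    rng = np.random.default_rng(seed)
    y, x = np.mgrid[0:n, 0:n]
    signal = np.zeros((n, n))
    for u, v in impulses:
        phase = rng.uniform(0.0, 2.0 * np.pi)  # arbitrary per-impulse phase
        signal += np.cos(2.0 * np.pi * (u * x + v * y) / n + phase)
    return signal


def strongest_peaks(image, count):
    """Return the (u, v) locations of the `count` largest Fourier magnitudes,
    searching only columns u = 1 .. n/2 - 1 so that each conjugate pair
    is reported once and the DC term is skipped."""
    n = image.shape[0]
    magnitude = np.abs(np.fft.fft2(image))
    half = magnitude[:, 1:n // 2]
    flat = np.argsort(half, axis=None)[-count:]
    rows, cols = np.unravel_index(flat, half.shape)
    return {(int(c) + 1, int(r)) for r, c in zip(rows, cols)}


# Stand-in "host image": white noise much stronger than each cosine.
noise_rng = np.random.default_rng(1)
host = noise_rng.normal(0.0, 4.0, size=(N, N))
marked = host + make_calibration_signal()

# A circular shift changes Fourier phase but not Fourier magnitude.
shifted = np.roll(marked, shift=(17, 40), axis=(0, 1))

print(strongest_peaks(marked, len(IMPULSES)) == set(IMPULSES))
print(strongest_peaks(shifted, len(IMPULSES)) == set(IMPULSES))
```

Because the magnitude peaks do not move under translation, a detector of this general kind would need the phase of the reference signal, not just its magnitude, to recover the X- and Y-shift.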
As camera-equipped processing devices (e.g., smartphones and point of sale terminals) proliferate, so do the opportunities for watermark technology. However, in certain applications, the computational burden of determining pose (e.g., the scale, rotation and translation of the watermarked object as depicted in imagery captured from the sensor's viewpoint, relative to an original, nominal state) can be an impediment to adoption of the technology.
An example is in supermarket point of sale (POS) scanners that are used to read watermarked product identifiers (e.g., “Global Trade Item Numbers,” or GTINs) encoded in artwork of certain retail product packages (e.g., cans of soup, boxes of cereal, etc.). Such POS cameras commonly grab 40-60 frames every second. If all frames are to be processed, each frame must be processed in 25 milliseconds (at 40 frames per second) or about 16.7 milliseconds (at 60 frames per second), or less. Since watermarked product markings have not yet supplanted barcode markings, and are not expected to do so for many years, POS scanners must presently look for both barcodes and watermarks in captured image frames. The processor chips employed in POS systems are usually modest in their computational capabilities.
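The per-frame time limits quoted above follow directly from the frame rate. A short sketch of the arithmetic (the function name is chosen only for this example):

```python
def frame_budget_ms(frames_per_second: float) -> float:
    """Time available to process one frame, in milliseconds, if no frame
    is to be skipped."""
    return 1000.0 / frames_per_second


for fps in (40, 60):
    print(f"{fps} frames/s -> {frame_budget_ms(fps):.1f} ms per frame")
```

This is the budget for all processing on a frame, which barcode decoding and watermark decoding must share.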
For many years, POS scanners processed only barcodes, and were able to apply nearly all of the available processing capability, and nearly the full 25 millisecond frame interval, to the task. With the emergence of watermarked GTINs, POS equipment had to perform two image processing tasks in the time formerly allocated to only one, i.e., now processing both barcodes and watermarks. Given the larger installed base of barcodes, barcode processing gets the lion's share of the processing budget. The smaller processing budget allocated to watermark processing (just a few milliseconds per frame) must encompass both the task of determining the pose with which the object is depicted in the image frame, and then extracting the GTIN identifier through use of the pose data. Between the two tasks, the former is the more intensive.
There are various approaches to determining pose of a watermarked object depicted in imagery. One employs a transform from the pixel (spatial) domain, into a Fourier-Mellin (a form of spatial-frequency) domain, followed by matched filtering, to find the calibration signal within the frame of captured imagery. This is shown, e.g., in U.S. Pat. Nos. 6,424,725 and 6,590,996. Another employs a least squares approach, as detailed in U.S. Pat. No. 9,182,778 and in pending applications Ser. No. 15/211,944, filed Jul. 15, 2016, and Ser. No. 15/628,400, filed Jun. 20, 2017. The former method employs processor-intensive operations, such as a domain transformation of the input image data to the Fourier-Mellin domain. The latter method employs simpler operations, but is iterative in nature, so it must cycle repeatedly in order to converge on a satisfactory output. Both approaches suffer in applications with tight constraints on processing resources and processing time.
The very short increment of time allocated for watermark processing of each captured image, and the computational intensity of the pose-determination task, have been persistent problems. They have led prior art approaches to resort to analyzing just a very small subset of the captured imagery for watermark data. An illustrative system analyzes just 3 or 4 small areas (e.g., of 128×128 pixels each), scattered across a much larger image frame (e.g., 1280×1024 pixels), which amounts to on the order of 5% of the captured imagery.
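The coverage figure can be checked with the example dimensions given above. In this sketch the function name is chosen only for illustration, and the analyzed areas are assumed not to overlap:

```python
def coverage_fraction(num_blocks: int, block_side: int,
                      frame_width: int, frame_height: int) -> float:
    """Fraction of frame pixels covered by non-overlapping square blocks."""
    return num_blocks * block_side ** 2 / (frame_width * frame_height)


for blocks in (3, 4):
    fraction = coverage_fraction(blocks, 128, 1280, 1024)
    print(f"{blocks} blocks of 128x128 in 1280x1024: {fraction:.2%}")
```

Three such blocks cover 3.75% of the frame and four cover 5%, so roughly 95% of each captured frame goes unexamined.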
The performance of watermark-based systems would be vastly improved if the computational complexity of pose determination could be reduced.
In accordance with certain embodiments of the present technology, object pose is determined without resort to complex or iterative operations. Instead, such embodiments employ a store of reference information to discern the pose with which an object is depicted in captured imagery. Memory lookups are exceedingly fast, and allow pose to be determined with just a small fraction of the computational intensity and time required by previous methods.
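The general idea of replacing computation with a lookup can be sketched as follows. This is an illustration only and is not the method of any particular embodiment: this passage does not say how the reference store is organized or keyed. The sketch assumes a hypothetical store keyed by the rounded positions of three hypothetical reference impulses, precomputed for rotations in 10 degree steps. Only rotations from 0 to 170 degrees are tabulated, because the Fourier magnitude of a real image cannot distinguish a rotation from the same rotation plus 180 degrees.

```python
import math

# Hypothetical reference impulse locations (u, v) in an unrotated image.
IMPULSES = [(9, 14), (21, 5), (30, 27)]
ANGLE_STEP = 10  # degrees between tabulated rotations (assumed)


def rotate(points, degrees):
    """Rotate frequency-domain points about the origin."""
    c = math.cos(math.radians(degrees))
    s = math.sin(math.radians(degrees))
    return [(u * c - v * s, u * s + v * c) for u, v in points]


def pose_key(points):
    """Quantize point positions into a hashable, order-independent key."""
    return tuple(sorted((round(u), round(v)) for u, v in points))


# Built once, offline: quantized peak pattern -> rotation in degrees.
REFERENCE_STORE = {
    pose_key(rotate(IMPULSES, angle)): angle
    for angle in range(0, 180, ANGLE_STEP)
}


def lookup_rotation(observed_points):
    """Return the tabulated rotation for an observed peak pattern,
    or None if the pattern is not in the store."""
    return REFERENCE_STORE.get(pose_key(observed_points))


observed = rotate(IMPULSES, 30)  # stand-in for peaks found in a captured frame
print(lookup_rotation(observed))
print(lookup_rotation([(1, 1), (2, 2), (3, 3)]))
```

At run time the only work is quantizing the observed pattern and performing one dictionary access; the cost of enumerating candidate poses is paid once, when the store is built. A practical store would also need to cover scale and to tolerate noise in the observed positions, neither of which this sketch attempts.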
In other embodiments, object pose is determined by presenting an excerpt of image-related data to a convolutional neural network, which has been trained with reference data of known object pose to establish the values of its parameters and weights. With a quick sequence of multiply and add operations, the network indicates whether a watermark is present and, if so, information about its pose state.
In still other embodiments, information other than pose state may also be determined, including—in some instances—the payload of the watermark depicted in captured imagery.
By such arrangements, watermark technology can be implemented more effectively in various applications (e.g., point of sale systems), and can be implemented in other applications where it was not previously practical.
The foregoing and additional features and advantages of the present technology will be more readily apparent from the following detailed description, which proceeds with reference to the accompanying drawings.