1. Field of the Art
The present specification generally relates to the field of image processing. More specifically, the present specification relates to a system and method for object recognition from multiple images.
2. Description of the Related Art
In many instances it is valuable to use image recognition technology to recognize multiple objects captured over multiple images. For example, in a retail environment, particular products are to be stocked at particular locations on shelves or displays, so it is useful to know the state of the products at those locations. Because of consumer activity, however, products can be out of stock or moved to incorrect locations. While a human can move products to their correct locations, it is time-consuming to record the position of all products. Therefore, it is useful to automatically or semi-automatically obtain information about the state of products on the shelves or displays. One method for obtaining this information is to use image recognition technology. However, capturing images in a retail environment can be difficult because of narrow aisles and activity in the store. Therefore, multiple images may need to be taken to capture all of the products of interest.
One method for obtaining information about the state of products on shelves or displays using image recognition technology is shown in FIG. 1. At 102, an image stitching module receives multiple input images. The images may be received in a graphic file format such as JPEG, TIFF, PNG, BMP, or the like. The stitching module may be a known stitching module, such as the detailed stitching example code that is part of the OpenCV machine vision software package. At 104, the stitching module stitches the multiple input images into a single stitched image. At 106, this single stitched image is used as input to a recognition module. At 108, the system may output the stitched image together with the products recognized from it. The products may be output in a machine-readable form. For example, the system may produce a JavaScript Object Notation (JSON) file, or an Extensible Markup Language (XML) file, containing a list of items and their locations in the stitched image.
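By way of illustration only, the machine-readable output of step 108 might be produced as in the following minimal sketch. The function name, the field names ("stitched_image", "items", "label", "location"), the file name, the product labels, and the pixel coordinates are all hypothetical and are not drawn from any particular recognition module; the sketch assumes each recognized product is reported as a label with a rectangular bounding box in stitched-image pixel coordinates.

```python
import json


def build_recognition_output(stitched_image_name, recognitions):
    """Serialize recognized products and their locations to a JSON string.

    `recognitions` is assumed (hypothetically) to be a list of tuples of the
    form (label, x, y, width, height), with coordinates in pixels of the
    stitched image.
    """
    items = [
        {
            "label": label,
            "location": {"x": x, "y": y, "width": width, "height": height},
        }
        for (label, x, y, width, height) in recognitions
    ]
    return json.dumps(
        {"stitched_image": stitched_image_name, "items": items},
        indent=2,
    )


# Hypothetical example: two products recognized in one stitched image.
example_output = build_recognition_output(
    "shelf_stitched.png",
    [
        ("cereal_box_a", 120, 40, 80, 200),
        ("soup_can_b", 310, 95, 60, 110),
    ],
)
print(example_output)
```

An XML file carrying the same list of items and locations could be produced in a comparable manner.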
Unfortunately, stitching images can introduce artifacts, which can interfere with optimal operation of the recognition module or produce incorrect recognition results. Operating the stitching module before recognition can therefore lead to missed products and incorrectly identified products because of the low-quality image input to the recognition module. Thus, it is desirable to be able to capture multiple images of shelves and recognize as many products, and the locations of those products, as possible. It is important to recognize all of the products, but not to double count products that appear in multiple images.