Challenges faced by the retail industry include frequent out-of-stock situations, product misplacement and organized retail crime including theft, all of which result in lower profit margins. Typically, human attendants carry out checks at predefined intervals to address these challenges. This requires considerable manpower, which in turn increases cost. To avoid this, some stores use smart shelves fitted with RFID antennae or weight sensors that can provide accurate information about the number of products available on the shelf. Other stores install an array of surveillance cameras that are strategically mounted to monitor the shelves. However, processing surveillance camera images can be challenging because of the distance and viewing angle, and the complexity of such systems grows with the number of cameras used for monitoring. Moreover, all of the above-mentioned methods require modification of the store infrastructure.
To avoid store modifications, robot-based systems have been suggested for scanning racks to detect and estimate the stock level of each product on the shelves. Such systems use cameras or other sensors, such as barcode readers, to extract the information, and can capture images of the products and the shelves at close distances. However, the success of such robot-based systems depends on having a robust method for detecting and counting products directly from images taken by a moving camera. This detection typically relies on finding the difference between a current image and a previous image to determine whether a product has been restocked or removed. Capturing, transmitting, comparing and then processing images in this way to detect, recognize and count stock requires a significant amount of time as well as resources. Moreover, multiple cameras are required to capture images from various angles in order to obtain accurate information, which increases the system cost.
Therefore, to overcome the aforementioned drawbacks, there remains a need for a system that recognizes and counts products directly from captured images.