1. Field
This disclosure relates to a system for performing video content analysis (VCA) using depth information to assist retailers in monitoring various aspects of their business. For example, the system may help retailers monitor their stores and/or the actions of their customers, employees, suppliers, etc., and may help optimize store processes, increase sales, reduce theft, and/or increase safety.
2. Background
Use of video to monitor retail environments can be very helpful for a business owner. Video can be reviewed in real time, or later after storage, to track inventory, to track actions of customers and/or employees, or to assist in theft detection, for example. As the number of monitored areas and the amount of related video to be reviewed increase, it has become impracticable for personnel to constantly review and analyze all recorded video to obtain the various information desired by retailers. To assist in this function, video content analysis systems have been designed. In a video content analysis (VCA) system, video streams are automatically analyzed to identify and classify objects, and to determine physical and temporal attributes of the objects. As a result, a log of analytics data may be stored. The analytics data may be used to determine events that occur in real time or at a later time, to aid in searching for objects or detected events, and for other purposes. Examples of VCA systems are described in U.S. Pat. No. 7,932,923, issued to Lipton et al. on Apr. 26, 2011 (the '923 patent), and U.S. Pat. No. 7,868,912, issued to Venetianer et al. on Jan. 11, 2011, the contents of each of which are incorporated herein by reference in their entirety.
For example, in a video surveillance system at a grocery store, objects such as customers, cashiers, and food items can be identified, and related events may be detected to cause a notification or some other action in response. For example, customers and cashiers may be identified and their actions reviewed for theft. Shelf space and items (or lack thereof) on the shelf space may be identified to assist in restocking and/or tracking merchandising effects. At an advertising display, people at the facility can be detected and tracked, and information about them can be collected, such as the amount of time an individual spends at a particular location (e.g., at the advertising display).
Some existing systems use RGB (red green blue), CMYK (cyan magenta yellow key), YCbCr, or other image sensors that sense images in a two-dimensional manner, and analyze those images to perform object and event detection. However, identifying objects and related actions using such two-dimensional image sensors may be prone to error. For example, a VCA system may determine that an object is a human based on an analysis of the shape of the detected object (e.g., the detected object has a certain shape, such as a particular size relationship of a detected torso, head, and arm/leg appendages). However, such an analysis may equally apply to the shadow of a human in the store. If the VCA system is used to determine whether a customer has slipped and fallen, an incorrect determination that a shadow is a human object may improperly alert store management to a slip and fall event. Similarly, objects in a shopping cart (e.g., a pumpkin or watermelon) may improperly be identified as a human (e.g., as the head of a human).
The embodiments described herein address some of these problems of existing retail monitoring systems by using depth and/or height data to assist in monitoring a retail environment. As a result, a more accurate system and method for detecting and tracking, for example, customers, employees, and/or inventory may be achieved.