The use of video surveillance and video analysis to deter or detect theft by customers and employees in retail settings is commonplace. These practices typically create too much video data for humans to effectively manage or review. As such, computerized tools for filtering and mining the video data to determine patterned behavior, anomalous behavior, or other markers of theft are increasingly used. These computerized tools typically have difficulty identifying ordinary theft behavior. Further, these computerized tools have particular difficulty identifying sophisticated theft behavior, such as when cashiers “sweetheart” transactions for their own benefit or for a customer's benefit. Sweethearting occurs, for example, when a cashier intentionally bypasses a barcode scanner during the product checkout process. Similar issues arise when cashiers unintentionally bypass the barcode scanner.
Current solutions that attempt to address these problems are typically based on analyzing the data available from retail store systems, such as the point of sale system, to identify behavior that potentially indicates theft. For example, one current solution uses this identified behavior to trigger manual review of video records from a video surveillance system to provide visual verification of the theft. Another current solution uses computer algorithms to directly analyze the video from the video surveillance system, in order to detect a level of abnormal behavior visually, independent of other data. Both of these solutions have drawbacks. In the former case, it might take a long time for identified patterns to trigger manual review, while in the latter case a high false alarm rate is typically exhibited.
General current solutions in the video analysis field involve performing more sophisticated video analysis in order to extract features from video data. For example, in Chen, Ming-yu and Hauptmann, Alexander, “Active Learning in Multiple Modalities for Semantic Feature Extraction from Video” (2005), Computer Science Department, Paper 976, the authors attempt to improve the way a support vector machine extracts features in video data by performing a linear combination of sub-modeled feature sets. Such general current solutions in the video analysis field do not directly address retail theft detection or the false alarm issue.
Specific current solutions involve integrating video analytics and data analysis in attempts to exploit their combined strengths, in order to compensate for the limitations of previous solutions. For example, in U.S. Patent Pub. No. 2008/0303902 A1, video content of an activity occurring at a monitored facility and transaction data relating to a transaction processed at a transaction terminal are collected and correlated. Subsequently, user-defined rules are applied to the correlated data, and by matching the data with the rules, potentially suspicious transactions are identified. For example, a potentially suspicious transaction is identified when a return transaction has occurred but no customers are near the point of sale. Current solutions that integrate information in this manner typically suffer from higher than acceptable false alarm rates.
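By way of illustration only, the rule-matching approach described above may be sketched as follows. This is a minimal sketch under stated assumptions and is not the implementation of the cited publication: the record layout (`CorrelatedEvent`), the rule `return_without_customer`, and the helper `flag_suspicious` are hypothetical names introduced here, and the customer count is assumed to have already been derived from the video content for the time window of each transaction.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class CorrelatedEvent:
    """Hypothetical record joining transaction data with video-derived data."""
    transaction_id: str
    transaction_type: str    # e.g. "sale" or "return"
    customers_near_pos: int  # assumed count from video analytics for the transaction window


# A user-defined rule maps a correlated event to True when it looks suspicious.
Rule = Callable[[CorrelatedEvent], bool]


def return_without_customer(event: CorrelatedEvent) -> bool:
    """Rule from the example above: a return processed with no customer nearby."""
    return event.transaction_type == "return" and event.customers_near_pos == 0


def flag_suspicious(events: List[CorrelatedEvent], rules: List[Rule]) -> List[str]:
    """Return the IDs of events that match at least one rule, in input order."""
    return [e.transaction_id for e in events if any(rule(e) for rule in rules)]


# Illustrative data only; these values are made up for the sketch.
events = [
    CorrelatedEvent("T1", "sale", 1),
    CorrelatedEvent("T2", "return", 0),
    CorrelatedEvent("T3", "return", 2),
]

suspicious = flag_suspicious(events, [return_without_customer])
```

In this sketch, a rule fires whenever its conditions are met exactly, so an error in the video-derived customer count alone is sufficient to flag a transaction.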