The exemplary embodiment relates generally to the detection of payment gestures in surveillance video and finds particular application in connection with a system and method that allow for automatic classification and detection of payment gestures in surveillance video.
Technological advancement and the increased availability of surveillance technology over the past few decades have enabled companies to perform new tasks with surveillance video. Generally, companies capture and store video footage of retail settings for their own protection and for the security and protection of employees and customers. However, this video footage has uses beyond security and safety, such as its potential for data mining and for estimating consumer behavior and experience. Analysis of video footage may allow for slight improvements in efficiency or customer experience, which in the aggregate can have a large financial impact. Many retailers provide services that are heavily data-driven and therefore have an interest in obtaining numerous customer and store metrics, such as queue lengths, experience time both in-store and for drive-through, specific order timing, order accuracy, and customer response.
Several corporations are patenting retail-setting applications for surveillance video beyond well-known security and safety applications. U.S. Pat. No. 5,465,115, issued Nov. 7, 1995, entitled VIDEO TRAFFIC MONITOR FOR RETAIL ESTABLISHMENTS AND THE LIKE, by Conrad et al., counts detected people and records the count according to the direction of movement of the people. U.S. Pat. No. 5,953,055, issued Sep. 14, 1999, entitled SYSTEM AND METHOD FOR DETECTING AND ANALYZING A QUEUE, by Huang et al., U.S. Pat. No. 5,581,625, issued Dec. 3, 1996, entitled STEREO VISION SYSTEM FOR COUNTING ITEMS IN A QUEUE, by Connell, and U.S. Pat. No. 6,195,121, issued Feb. 27, 2001, entitled SYSTEM AND METHOD FOR DETECTING AND ANALYZING A QUEUE, by Huang et al., each disclose examples of monitoring queues. U.S. Pat. No. 6,654,047, issued Nov. 25, 2003, entitled METHOD OF AND DEVICE FOR ACQUIRING INFORMATION ON A TRAFFIC LINE OF PERSONS, by Iizaka, monitors groups of people within queues. U.S. Pat. No. 7,688,349, issued Mar. 30, 2010, entitled METHOD OF DETECTING AND TRACKING GROUPS OF PEOPLE, by Flickner et al., monitors various behaviors within a reception setting.
While the above-mentioned patents describe data mining applications related to video monitoring, none of them discloses the detection of payment gestures within a retail or surveillance setting. Data-driven retailers are showing increased interest in process-related data from which performance metrics can be extracted. One such performance metric is a customer's total experience time (TET), from which guidelines to improve order efficiency and customer satisfaction can be derived. While the prior art teaches how to estimate important components of the TET, such as queue length, no techniques have been disclosed for accurately estimating payment time, which is a key element in TET measurement. Knowledge of additional information relevant to the payment process, such as payment type (e.g., credit, debit, or cash), would also be useful in the analysis of TET data. Therefore, there is a need for a system and method that automatically detects and classifies payment gestures in surveillance video.
In general, gesture recognition approaches have been based on modeling human movement. Many rely on local image and video based features, as disclosed in LEARNING REALISTIC HUMAN ACTIONS FROM MOVIES, by I. Laptev et al. (CVPR 2008), and RECOGNIZING HUMAN ACTIONS: A LOCAL SVM APPROACH, by C. Schuldt et al. (ICPR 2004), each of which describes modeling of the human shape during certain actions. More recent approaches have employed space-time feature detectors and descriptors, as disclosed in EVALUATION OF LOCAL SPATIO-TEMPORAL FEATURES FOR ACTION RECOGNITION, by H. Wang et al. (BMVC 2009). These gesture recognition approaches, however, have not been applied to surveillance video in retail applications, where payment gestures could be detected.
A system and method for automatically detecting and classifying payment gestures in surveillance video is desired. Successful detection of payment gestures with a simple, computationally inexpensive algorithm may prove effective in aiding recent efforts by retailers to capture a customer's experience through performance metrics. The method may focus on algorithmic processing of a video sequence to provide accurate detection of various payment gestures at near real-time speeds.