1. Field of the Invention
The present invention is a method and system for automatically analyzing the behavior of people in a physical space based on the information for the trip of the people, by capturing a plurality of input images of the people by a plurality of means for capturing images, processing the plurality of input images in order to track the people in each field of view of the plurality of means for capturing images, and finding information for the trip of the people based on the processed results from the plurality of tracks, where the exemplary behavior analysis comprises map generation as visualization of the behavior, quantitative category measurement, dominant path measurement, category correlation measurement, and category sequence measurement.
2. Background of the Invention
Shoppers' Behavior Analysis:
U.S. Pat. Appl. Pub. No. 2006/0010028 of Sorensen (hereinafter Sorensen 2006/0010028) disclosed a method for tracking shopper movements and behavior in a shopping environment using a video. In Sorensen 2006/0010028, a user indicated a series of screen locations in a display at which the shopper appeared in the video, and the series of screen locations were translated to store map coordinates. The step of receiving the user input via input devices, such as a pointing device or keyboard, makes Sorensen 2006/0010028 inefficient for handling a large amount of video data in a large shopping environment with a relatively complicated store layout, especially over a long period of time. The manual input by a human operator/user cannot efficiently track all of the shoppers in such cases, not to mention the possibility of human errors due to tiredness and boredom. Also, the manual input approach is not scalable according to the number of shopping environments to handle.
Although U.S. Pat. Appl. Pub. No. 2002/0178085 of Sorensen (hereinafter Sorensen 2002/0178085) disclosed a usage of a tracking device and store sensors in a plurality of tracking systems primarily based on wireless technology, such as RFID, Sorensen 2002/0178085 is clearly foreign to the concept of applying computer vision-based tracking algorithms to the field of understanding customers' shopping behavior and movement.
In Sorensen 2002/0178085, each transmitter was typically attached to a handheld or push-type cart. Therefore, Sorensen 2002/0178085 cannot distinguish the behaviors of multiple shoppers using one cart from the behavior of a single shopper also using one cart. Although Sorensen 2002/0178085 disclosed that the transmitter may be attached directly to a shopper via a clip or other form of customer surrogate in order to help in the case where the customer is shopping without a cart, this is not practical because it imposes an additional cumbersome step on the shopper, not to mention the inefficiency of managing a transmitter for each individual shopper.
U.S. Pat. No. 6,741,973 of Dove, et al. (hereinafter Dove) disclosed a model of generating customer behavior in a transaction environment. Although Dove disclosed video cameras in a real bank branch as a way to observe the human behavior, Dove is clearly foreign to the concept of automatic and real-time analysis of the customers' behavior based on visual information of the customers in a retail environment, such as shopping path tracking and analysis.
U.S. Pat. Appl. Pub. No. 2003/0053659 of Pavlidis, et al. (hereinafter Pavlidis) disclosed a method for moving object assessment, including an object path of one or more moving objects in a search area, using a plurality of imaging devices and segmentation by background subtraction. In Pavlidis, the object included customers. Pavlidis was primarily related to monitoring a search area for surveillance.
U.S. Pat. Appl. Pub. No. 2004/0120581 of Ozer, et al. (hereinafter Ozer) disclosed a method for identifying the activity of customers for marketing purposes or the activity of objects in a surveillance area, by comparing the detected objects with the graphs from a database. Ozer tracked the movement of different object parts and combined them into high-level activity semantics, using several Hidden Markov Models (HMMs) and a distance classifier. U.S. Pat. Appl. Pub. No. 2004/0131254 of Liang, et al. (hereinafter Liang) also disclosed Hidden Markov Models (HMMs), along with rule-based label analysis and a token parsing procedure, as a way to characterize behavior. Liang disclosed a method for monitoring and classifying actions of various objects in a video, using background subtraction for object detection and tracking. Liang is particularly related to animal behavior in a lab for testing drugs. Neither Ozer nor Liang disclosed a method or system for tracking people in a physical space using multiple cameras.
Activity Analysis in Various Other Areas, such as Surveillance Applications:
There have been earlier attempts at activity analysis in areas other than understanding customers' shopping behavior, such as surveillance and security applications.
The following prior art is not restricted to the application area of understanding customers' shopping behaviors in a targeted environment, but discloses methods for object activity modeling and analysis for a human body, using video, in general.
U.S. Pat. Appl. Pub. No. 2002/0085092 of Choi, et al. (hereinafter Choi) disclosed a method for modeling an activity of a human body using the optical flow vector from a video and probability distribution of the feature vectors from the optical flow vector. Choi modeled a plurality of states using the probability distribution of the feature vectors and expressed the activity based on the state transition.
U.S. Pat. Appl. Pub. No. 2004/0113933 of Guler disclosed a method for automatic detection of split and merge events from video streams in a surveillance environment. Guler considered split and merge behaviors as key common simple behavior components in order to analyze high-level activities of interest for surveillance application, which are also used to understand the relationships among multiple objects, and not just individual behavior. Guler used adaptive background subtraction to detect the objects in a video scene, and the objects were tracked to identify the split and merge behaviors. To understand the split and merge behavior-based, high-level events, Guler used a Hidden Markov Model (HMM).
The prior art lacks the features for automatically analyzing the trips of people in a physical space, by capturing multiple input images of the people by multiple means for capturing images and tracking the people in each field of view of the means for capturing images, while joining the track segments across the multiple fields of view and mapping the trips onto the coordinates of the physical space. Essentially, the prior art lacks the features for finding the information of the trips of the people based on the automatically processed results from the plurality of tracks using computer vision algorithms. Therefore, a novel usage of computer vision technologies for understanding the shoppers' trips in a more efficient manner in a physical space, such as a retail environment, is needed.
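The joining of track segments across fields of view and their mapping onto the coordinates of the physical space can be illustrated with a minimal sketch. The sketch is hypothetical and is not taken from any cited reference or from the disclosed embodiments: it assumes that the segments have already been attributed to the same person, and that each camera has a known affine transform from image coordinates to floor coordinates. The names `to_floor`, `join_segments`, and the transform layout are assumptions for illustration only.

```python
def to_floor(point, transform):
    """Map an image point (x, y) to floor coordinates.

    transform is an assumed 2x3 affine matrix ((a, b, tx), (c, d, ty)).
    """
    (a, b, tx), (c, d, ty) = transform
    x, y = point
    return (a * x + b * y + tx, c * x + d * y + ty)


def join_segments(segments, transforms):
    """Join per-camera track segments into one trip in floor coordinates.

    segments: list of (camera_id, [(t, x, y), ...]) for one person.
    transforms: dict mapping camera_id to its affine transform.
    Returns a list of (t, floor_x, floor_y) ordered by time.
    """
    trip = []
    for camera_id, points in segments:
        for t, x, y in points:
            fx, fy = to_floor((x, y), transforms[camera_id])
            trip.append((t, fx, fy))
    # Order the joined points by timestamp so the trip reads start to end.
    trip.sort(key=lambda p: p[0])
    return trip


# Hypothetical two-camera layout: camera "B" is offset 10 units along x.
transforms = {
    "A": ((1, 0, 0), (0, 1, 0)),
    "B": ((1, 0, 10), (0, 1, 0)),
}
segments = [
    ("B", [(2, 0, 5)]),
    ("A", [(0, 1, 5), (1, 9, 5)]),
]
trip = join_segments(segments, transforms)
```

In practice, deciding which segments belong to the same person is the harder problem; the sketch only shows the coordinate mapping and time ordering once that association is given.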
With regard to the temporal behavior of customers, U.S. Pat. Appl. Pub. No. 2003/0002712 of Steenburgh, et al. (hereinafter Steenburgh) disclosed a method for measuring dwell time of an object, particularly a customer in a retail store, which enters and exits an environment, by tracking the object and matching the entry signature of the object to the exit signature of the object, in order to find out how long people spend in retail stores, using a stereo vision camera. Although Steenburgh is limited to a stereo vision camera, the method in Steenburgh can be used as one of the many exemplary methods to measure the dwell time of people in a physical space. However, Steenburgh is clearly foreign to the idea of analyzing the complex behavior of people in the physical space in combination with other measurement attributes such as trip information.
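The general idea of measuring dwell time by matching an entry signature to an exit signature can be sketched as follows. This is a simplified illustration under stated assumptions, not the method of Steenburgh: a signature is assumed to be a fixed-length numeric feature vector, matching is a greedy nearest-neighbor search using Euclidean distance, and the function name `match_dwell_times` is hypothetical.

```python
import math


def match_dwell_times(entries, exits):
    """Estimate dwell times by matching each exit to the closest entry.

    entries, exits: lists of (timestamp_seconds, signature) tuples, where
    signature is a tuple of floats (an assumed appearance descriptor).
    Returns dwell times in seconds, in order of exit time.
    """
    remaining = list(entries)
    dwell = []
    for exit_time, exit_sig in sorted(exits, key=lambda e: e[0]):
        # Only entries that happened before this exit can match it.
        candidates = [e for e in remaining if e[0] <= exit_time]
        if not candidates:
            continue
        best = min(candidates, key=lambda e: math.dist(e[1], exit_sig))
        remaining.remove(best)  # each entry is matched at most once
        dwell.append(exit_time - best[0])
    return dwell


entries = [(0, (1.0, 0.0)), (10, (0.0, 1.0))]
exits = [(100, (0.1, 0.9)), (130, (0.9, 0.1))]
dwell_times = match_dwell_times(entries, exits)
```

A greedy match is used here only for brevity; it can mis-assign people with similar signatures, which a global assignment method would handle better.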
U.S. Pat. Appl. Pub. No. 2003/0058339 of Trajkovic, et al. (hereinafter Trajkovic) disclosed a method for detecting an event through repetitive patterns of human behavior. Trajkovic learned multidimensional feature data from the repetitive patterns of human behavior and computed a probability density function (PDF) from the data. Then, a method for the PDF analysis, such as Gaussian or clustering techniques, was used to identify the repetitive patterns of behavior and unusual behavior through the variance of the Gaussian distribution or cluster.
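The kind of PDF analysis described above can be illustrated with a one-dimensional Gaussian. The sketch below is an assumption-laden simplification and not the method of Trajkovic: it uses a single scalar feature rather than multidimensional data, and the threshold `k` and the function names are hypothetical.

```python
import math


def fit_gaussian(samples):
    """Return the mean and (population) variance of scalar samples."""
    n = len(samples)
    mean = sum(samples) / n
    var = sum((x - mean) ** 2 for x in samples) / n
    return mean, var


def is_unusual(x, mean, var, k=3.0):
    """Flag x as unusual if it lies more than k standard deviations from the mean."""
    return abs(x - mean) > k * math.sqrt(var)


# Hypothetical feature values observed for a repetitive behavior.
samples = [10, 11, 9, 10, 10]
mean, var = fit_gaussian(samples)
```

With these samples the fitted mean is 10 and the standard deviation is roughly 0.63, so a value of 10.5 falls within the learned pattern while a value of 15 is flagged as unusual.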
Although Trajkovic can model a repetitive behavior through the PDF analysis, Trajkovic is clearly foreign to the aggregate of non-repetitive behaviors, such as the shopper traffic in a physical store. The shopping path of an individual shopper can be repetitive, but each shopping path in a group of aggregated shopping paths of multiple shoppers is not repetitive. Trajkovic is clearly foreign to the challenges that can be found in a retail environment.
A novel usage of computer vision technologies for understanding the shoppers' behavior in a physical space by automatically analyzing the trips of people in the physical space is disclosed in the present invention. The present invention also includes novel methods to extract analytical and statistical data from the trip information that construct various output representations.