1. Field of the Invention
The present invention relates generally to a system and method for tracking objects, such as people, using one or more video cameras. In one embodiment, the present invention may be used to track people through a retail environment.
2. Description of the Prior Art
Basic video tracking systems are well known in the art. However, previously known video tracking systems lack certain functional capabilities required for generating accurate and comprehensive tracking information, especially when multiple video cameras are used.
To date, research and development has focused primarily on single-camera tracking solutions. For example, see Celenk et al. in a 1988 IEEE article entitled “Moving Object Tracking Using Local Windows”; Tsai et al. in IEEE articles, published in 1981, entitled “Estimating Three-Dimensional Motion Parameters Of A Rigid Planar Patch, and Uniqueness” and “Estimation Of Three-Dimensional Motion Parameters Of Rigid Objects With Curved Surfaces”; Liao in a 1994 article entitled “Tracking Human Movements Using Finite Element Methods”; Montera et al. in a 1993 SPIE article entitled “Object Tracking Through Adaptive Correlation”; Burt et al. in a 1989 article entitled “Object Tracking With A Moving Camera”; Sethi et al. in a 1987 article entitled “Finding Trajectories Of Feature Points In A Monocular Image Sequence”; and Salari et al. in a 1990 article entitled “Feature Point Correspondence In The Presence Of Occlusion.”
Q. Cai et al., in an article entitled “Automatic Tracking of Human Motion in Indoor Scenes Across Multiple Synchronized Video Streams,” describe a method for tracking objects through multiple cameras. This solution is limited by the fact that all Single View Tracking systems must be accurately time-synchronized in order to support accurate camera hand-off. Also, intensity features are used for camera-to-camera hand-off, even though in most applications intensity features will vary from camera to camera depending on camera viewing perspective (for example, one camera views the front of the object being tracked while the other views the back). This methodology may work well in simple environments with a limited number of cameras, but will likely not work well in complex environments and/or environments with a large number of cameras.
Robert B. Boyette, in U.S. Pat. No. 5,097,328, describes a system that collects and reports information on the number of people in a queue, service time, and anticipated wait time in a queue for a bank branch. This system is limited by the fact that the average time in a queue is computed from arrival rates and service times rather than from actual queue wait times, and as such is inaccurate. Also, since there is no record of individual customer activities, it is not possible to generate reports on a person's sequence of activities, which can be used in identifying customer behavior.
There is therefore a need for a sophisticated, yet cost-effective, tracking system that can be used in many applications. For example, it has become desirable to acquire information concerning the activity of people within a scene, such as a retail establishment, a bank, automatic teller machines, or bank teller windows, to name a few, using data gathered from analysis of video information acquired from the scene.
It is also desirable to monitor the behavior of consumers in various locations of a retail establishment in order to provide information concerning the sequence of events and decisions that a consumer makes. This information is useful in many situations, such as adjusting the location and features of services provided in a bank, or changing merchandising strategies and display arrangements. Consequently, it is necessary for the system to differentiate between people in the scene, and between people and other stationary and moving objects in the scene.
Given the size of these environments, a video tracking system is needed which can track the movement of objects, such as people, through multiple cameras. Moreover, this tracking system must support the capability to query track information in order to generate information that describes how people and the environment interact with one another.