1. Field of the Invention
Embodiments of the invention provide a long-term memory used to store and retrieve percepts in a video analysis system. More specifically, embodiments of the invention relate to techniques for programmatically associating, learning, and recalling patterns of behavior depicted in a sequence of video frames.
2. Description of the Related Art
Some currently available video surveillance systems provide simple object recognition capabilities. For example, a video surveillance system may be configured to classify a group of pixels in a given frame having certain specified characteristics (referred to as a “blob”) as being a particular object (e.g., a person or vehicle). Once identified, a “blob” may be tracked from frame to frame in order to follow the movement of the “blob” over time, e.g., a person walking across the field of vision of a video surveillance camera. Further, such systems may be able to determine when an object has engaged in certain predefined behaviors.
However, such surveillance systems typically require that the objects and/or behaviors which may be recognized by the system be defined in advance. Thus, in practice, these systems simply compare recorded video to predefined definitions for objects and/or behaviors. In other words, unless the underlying system includes a description of a particular object or behavior, the system is generally incapable of recognizing that object or behavior (or at least instances of the pattern describing the particular object or behavior). Thus, to recognize additional objects or behaviors, separate software products may need to be developed. This results in surveillance systems with recognition capabilities that are labor intensive and prohibitively costly to maintain or adapt for different specialized applications. For example, monitoring airport entrances for lurking criminals and identifying swimmers who are not moving in a pool are two distinct situations, and therefore may require developing two distinct software products, each having its respective “abnormal” behaviors pre-coded. Thus, currently available video surveillance systems are typically incapable of recognizing new patterns of behavior that may emerge in a given scene or recognizing changes in existing patterns. Further, such systems are often unable to associate related aspects from different patterns of observed behavior, e.g., to learn to identify behavior being repeatedly performed by a criminal prior to breaking into cars parked in a parking lot.
Further, the static patterns that available video surveillance systems are able to recognize are frequently either under-inclusive (i.e., the pattern is too specific to recognize many instances of a given object or behavior) or over-inclusive (i.e., the pattern is general enough to trigger many false positives). In some cases, the sensitivity may be adjusted to help improve the recognition process; however, this approach fundamentally relies on the ability of the system to recognize predefined patterns for objects and behavior. As a result, by restricting the range of objects that a system may recognize using a predefined set of patterns, many available video surveillance systems have been of limited usefulness.