This invention relates to a method and apparatus for clustering color data in video sequences, and more particularly to color clustering methods for detecting video sequence scene changes and color clustering methods for tracking objects in a video sequence.
Conventional methods for clustering color data of an image are based on block truncation and vector quantization. Block truncation of image data is a coding process in which significant visual features are retained while other data is discarded. Vector quantization is a process for mapping a sequence of vectors into a digital sequence suitable for communication over a digital channel and for storage on a digital medium. In a typical block truncation implementation an image is divided iteratively to achieve an optimal number of component subimages. A classification criterion is used to truncate the data within a window. Each color class within a window is represented by a corresponding mean color vector. A linear vector quantizer determines a mapping of the subimage pixels.
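By way of illustration only, the truncation of a single window may be sketched as follows. The sketch assumes a two-class criterion that compares each pixel's mean intensity to the window's mean intensity; that criterion and the names `truncate_block` and `reconstruct` are hypothetical and are not taken from any particular conventional implementation. Each class is then represented by its mean color vector, as described above.

```python
def truncate_block(pixels):
    """Split a window of (r, g, b) pixels into at most two color classes.

    Returns (means, labels): means maps a class index to that class's mean
    color vector, and labels gives the class index of each pixel.
    """
    # Assumed classification criterion: pixel intensity versus window mean intensity.
    intensity = [sum(p) / 3.0 for p in pixels]
    threshold = sum(intensity) / len(intensity)
    labels = [1 if v > threshold else 0 for v in intensity]

    means = {}
    for cls in (0, 1):
        members = [p for p, lab in zip(pixels, labels) if lab == cls]
        if members:  # a uniform window yields only one class
            means[cls] = tuple(sum(channel) / len(members) for channel in zip(*members))
    return means, labels


def reconstruct(means, labels):
    """Replace every pixel by the mean color vector of its class."""
    return [means[lab] for lab in labels]


window = [(0, 0, 0), (10, 10, 10), (200, 200, 200), (210, 210, 210)]
means, labels = truncate_block(window)
approximation = reconstruct(means, labels)
```

Only the class labels and the mean color vectors need to be retained for the window; the remaining pixel data is discarded.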
Another method for clustering data is found in pattern learning and recognition systems based upon adaptive resonance theory (ART). Adaptive resonance theory, as coined by Grossberg, is a system for self-organizing stable pattern recognition codes in real-time data in response to arbitrary sequences of input patterns. (See "Adaptive Pattern Classification and Universal Recoding: II . . . ," by Stephen Grossberg, Biological Cybernetics 23, pp. 187-202 (1976)). It is based on the problem of discovering, learning and recognizing invariant properties of a data set, and is somewhat analogous to the human processes of perception and cognition. The invariant properties, called recognition codes, emerge in human perception through an individual's interaction with the environment. When these recognition codes emerge spontaneously, as in human perception, the process is said to be self-organizing.
It is desirable that neural networks implementing adaptive resonance theory be capable of self-organizing, self-stabilizing and self-scaling the recognition codes in response to temporal sequences of arbitrarily many input patterns of varying complexity. A system implementing adaptive resonance theory generates recognition codes in response to a series of environmental inputs. As learning proceeds, interactions between the inputs and the system generate new steady states and basins of attraction. These steady states are formed as the system discovers and learns critical feature patterns that represent invariants of the set of all experienced input patterns. This ability is referred to as plasticity. The learned codes are dynamically buffered against relentless recoding due to irrelevant inputs. The buffering process suppresses possible sources of instability.
Adaptive Resonance Theory-1 (`ART 1`) networks implement a set of differential equations responsive to arbitrary sequences of binary input patterns. Adaptive Resonance Theory-2 (`ART 2`) networks self-organize stable recognition categories in response to arbitrary sequences of not only binary, but also analog (gray-scale, continuous-valued) input patterns. See "ART 2: Self-Organization of Stable Category Recognition Codes for Analog Input Patterns," by Gail A. Carpenter and Stephen Grossberg. To handle arbitrary sequences of analog input patterns, ART 2 architectures employ a stability-plasticity tradeoff, a search-direct access tradeoff and a match-reset tradeoff. Top-down learning expectation and matching mechanisms are significant features in self-stabilizing the code learning process. A parallel search scheme updates itself adaptively as the learning process unfolds. After learning stabilizes, the search process is disengaged. Thereafter input patterns directly access their recognition codes without any search. A novel input pattern can directly access a category if it shares invariant properties with a set of exemplars of that category. A parameter called an attentive vigilance parameter determines how fine the categories are to be. If vigilance increases due to environmental feedback, then the system automatically searches for and learns finer recognition categories. If vigilance decreases due to environmental feedback, then the system automatically searches for and learns coarser recognition categories.
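The role of a vigilance threshold may be illustrated by the following greatly simplified sketch. It is not an ART 2 network: it omits the differential equations, the top-down expectation and the parallel search, and keeps only a match test against a threshold followed by either a prototype update or the creation of a new category. The match measure (one minus the normalized Euclidean distance between 8-bit RGB vectors), the running-mean update and the names `match` and `cluster` are assumptions made for illustration only.

```python
import math

# Largest possible Euclidean distance between two 8-bit RGB vectors.
MAX_DIST = math.sqrt(3) * 255.0


def match(a, b):
    """Assumed match score in [0, 1]; 1.0 means the two vectors are identical."""
    dist = math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))
    return 1.0 - dist / MAX_DIST


def cluster(inputs, vigilance):
    """Assign each input to the best-matching prototype if its match score
    meets the vigilance threshold; otherwise start a new category."""
    prototypes, counts, labels = [], [], []
    for x in inputs:
        best, best_score = None, -1.0
        for k, w in enumerate(prototypes):
            score = match(x, w)
            if score > best_score:
                best, best_score = k, score
        if best is not None and best_score >= vigilance:
            # Match accepted: move the prototype toward the input (running mean).
            counts[best] += 1
            n = counts[best]
            prototypes[best] = tuple(wi + (xi - wi) / n
                                     for wi, xi in zip(prototypes[best], x))
        else:
            # Match rejected (or no prototype yet): create a new category.
            prototypes.append(tuple(float(v) for v in x))
            counts.append(1)
            best = len(prototypes) - 1
        labels.append(best)
    return prototypes, labels


colors = [(250, 0, 0), (240, 10, 10), (0, 0, 250), (10, 10, 240), (200, 40, 40)]
coarse_prototypes, coarse_labels = cluster(colors, vigilance=0.5)
fine_prototypes, fine_labels = cluster(colors, vigilance=0.9)
```

In this sketch the same five colors fall into two categories at the lower threshold and three at the higher threshold, because a stricter match test rejects the darker red that the looser test accepts.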
According to this invention, an ART 2 network is modified to achieve a clustering and pattern recognition system.