The present invention relates to video data processing, and more particularly to a process for extracting regions of homogeneous texture in a digital picture.
Extraction of semantically meaningful visual objects from still images and video has numerous applications in video editing, processing, and compression (as in MPEG-4) as well as in search (as in MPEG-7). Extraction of a semantically meaningful object such as a building, a person, or a car may be decomposed into the extraction of homogeneous regions of the semantic object and the performance of a "union" of these portions at a later stage. The homogeneity can be in color, texture, or motion. As an example, extraction of a car could be considered as the extraction of the tires, the windows and other glass portions, and the body of the car itself.
What is desired is a process that may be used to extract a homogeneous portion of the object based upon texture.
Accordingly the present invention provides a process for extracting regions of homogeneous texture in a digital picture based on a color texture gradient field, using either a weighted Euclidean distance between moment-based feature vectors or a probability mass function (pmf)-based distance metric. The digital picture is divided into a plurality of blocks, and for each block a feature vector is generated as a function of the moments of the data. A gradient is extracted for each block as a function of the feature vector, the gradient being defined as the maximum distance between the feature vector of the current block and the feature vectors of its nearest neighboring blocks, the distance being measured either by the weighted Euclidean distance or by the pmf-based distance. The resulting gradient field is smoothed by morphological preprocessing, and the preprocessed gradient field is segmented by a watershed algorithm to produce regions of homogeneous texture.
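By way of illustration only, the block-wise feature extraction and gradient computation summarized above may be sketched as follows. The sketch is not the claimed implementation and rests on several assumptions that the text does not fix: a single-channel picture stored as a list of rows, a feature vector consisting of only the mean and variance of each block, equal weights, and an 8-connected neighborhood as the "nearest neighboring blocks". The function names `block_features`, `weighted_distance`, and `gradient_field` are hypothetical. The morphological smoothing and watershed segmentation steps are omitted.

```python
import math


def block_features(image, block):
    """Return a grid of (mean, variance) feature vectors, one per block.

    Assumption: only the first two moments are used as features.
    """
    rows, cols = len(image) // block, len(image[0]) // block
    feats = []
    for br in range(rows):
        row_feats = []
        for bc in range(cols):
            pixels = [image[r][c]
                      for r in range(br * block, (br + 1) * block)
                      for c in range(bc * block, (bc + 1) * block)]
            mean = sum(pixels) / len(pixels)
            var = sum((p - mean) ** 2 for p in pixels) / len(pixels)
            row_feats.append((mean, var))
        feats.append(row_feats)
    return feats


def weighted_distance(f, g, weights):
    """Weighted Euclidean distance between two feature vectors."""
    return math.sqrt(sum(w * (a - b) ** 2 for w, a, b in zip(weights, f, g)))


def gradient_field(feats, weights):
    """Gradient of a block = maximum distance to any neighboring block.

    Assumption: neighbors are the (up to) 8 adjacent blocks.
    """
    rows, cols = len(feats), len(feats[0])
    grad = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    nr, nc = r + dr, c + dc
                    if (dr or dc) and 0 <= nr < rows and 0 <= nc < cols:
                        d = weighted_distance(feats[r][c], feats[nr][nc], weights)
                        grad[r][c] = max(grad[r][c], d)
    return grad


# Toy 4x8 picture: flat left half (value 10) and flat right half (value 50).
image = [[10] * 4 + [50] * 4 for _ in range(4)]
feats = block_features(image, block=2)
grad = gradient_field(feats, weights=(1.0, 1.0))
```

In this toy picture the gradient is zero for blocks lying wholly inside either flat half and non-zero only for the blocks adjacent to the boundary between the halves, which is the ridge that a subsequent watershed segmentation would be expected to follow.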