In large media management systems, it is desirable to categorize images that have general semantic similarity so that stored images can be retrieved efficiently and effectively. Categorizing images manually is time-consuming and impractical, especially where large numbers of images are being categorized, and thus techniques to categorize images automatically are desired.
Techniques for automatically categorizing images have been considered. For example, U.S. Pat. No. 5,872,865 to Normile et al. discloses a system for automatically classifying images and video sequences. The system executes a classification application that is trained for an initial set of categories to determine eigenvalues and eigenvectors that define the categories. Input video sequences are then classified using one of three approaches: orthogonal decomposition using image attributes, orthogonal decomposition in the pixel domain, or neural net based classification. A set of primitive attributes based on average bin color histogram, average luminance on intensity, average motion vectors and texture parameters is generated for frames of the video sequence. Frames of the video sequence are transformed into a canonical space defined by the eigenvectors, allowing the primitive attributes to be compared to the eigenvalues and the eigenvectors defining the categories, thereby to allow the frames to be classified.
U.S. Pat. No. 6,031,935 to Kimmel discloses a method and apparatus for segmenting images using deformable contours. A priori information concerning a target object to be segmented, i.e. its border, is entered. The target object is manually segmented by tracing the target object in training images, thereby to train the apparatus. A search image is then chosen and a nearest-neighbour training image is selected. The traced contour in the training image is then transferred to the search image to form a search contour. The search contour is deformed to lock onto regions of the target object which are believed to be highly similar based on the a priori information and the training information. Final segmentation of the search contour is then completed.
U.S. Pat. No. 6,075,891 to Burman discloses a non-literal pattern recognition method and system for hyperspectral imagery exploitation. An object is scanned to produce an image set defining optical characteristics of the object including non-spatial spectral information and electromagnetic spectral band data. A spectral signature from a single pixel in the image set is extracted. The spectral signature is then filtered and normalized and forwarded to a material categorization system to identify categories related to the sensed data. A genetic algorithm is employed that solves a constrained mixing equation to detect and estimate the abundance of constituent materials that comprise the input spectral signature.
U.S. Pat. No. 6,477,272 to Krumm et al. discloses a system and process for identifying the location of a modelled object in a search image. Model images of the object, whose location is to be identified in the search image, are captured. Each model image is processed by generating counts of every pair of pixels that exhibit colors falling within the same combination of a series of pixel color ranges and that are separated by a distance falling within the same one of a series of distance ranges. A co-occurrence histogram is thereby computed for each of the model images. A series of search windows is generated from overlapping portions of the search image. A co-occurrence histogram is also computed for each of the search windows using the pixel color and distance ranges established for the model images. To assess the similarity between each model image and each search window, the co-occurrence histograms from the model images and the search windows are compared to yield similarity values. If a similarity value is above a threshold, the object is deemed to be in the search window.
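By way of illustration only, the pair-counting step described above can be sketched as follows. This is a minimal sketch and not the implementation of the cited reference: it assumes the image has already been quantized into color-range indices, that distance ranges are given as ascending upper bounds, and that similarity is measured by a normalized histogram intersection. The function names `cooccurrence_histogram` and `intersection_similarity` are hypothetical.

```python
from collections import Counter
from itertools import combinations
import math


def cooccurrence_histogram(image, distance_edges):
    """Count pixel pairs by (color index, color index, distance bin).

    image: 2D list of quantized color indices (assumed input format).
    distance_edges: ascending upper bounds of the distance ranges (assumed).
    Pairs farther apart than the last bound are ignored.
    """
    pixels = [(r, c, image[r][c])
              for r in range(len(image))
              for c in range(len(image[r]))]
    hist = Counter()
    for (r1, c1, k1), (r2, c2, k2) in combinations(pixels, 2):
        distance = math.hypot(r1 - r2, c1 - c2)
        for bin_index, upper in enumerate(distance_edges):
            if distance <= upper:
                # Order the two color indices so (a, b) and (b, a) share a bin.
                hist[(min(k1, k2), max(k1, k2), bin_index)] += 1
                break
    return hist


def intersection_similarity(model_hist, window_hist):
    """Normalized histogram intersection (an assumed similarity measure)."""
    total = sum(model_hist.values())
    if total == 0:
        return 0.0
    shared = sum(min(count, window_hist.get(key, 0))
                 for key, count in model_hist.items())
    return shared / total


model = cooccurrence_histogram([[0, 0], [1, 1]], [1.0, 2.0])
window = cooccurrence_histogram([[0, 0], [0, 0]], [1.0, 2.0])
similarity = intersection_similarity(model, window)
```

In such a sketch, a search window would be deemed to contain the object when `similarity` exceeds a chosen threshold.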
U.S. Pat. No. 6,611,622 to Krumm discloses an object recognition system and process that identifies people and objects depicted in an image of a scene. Model histograms of the people and objects that are to be identified in the image are created. The image is segmented to extract regions which likely correspond to the people and objects being identified. A histogram is computed for each of the extracted regions and the degree of similarity between each extracted region histogram and each of the model histograms is assessed. An extracted region whose histogram exhibits a degree of similarity to one of the model histograms that exceeds a prescribed threshold is designated as corresponding to the person or object associated with that model histogram.
U.S. Pat. No. 6,668,084 to Minami discloses an image recognition method wherein search models are created that identify the shape and luminance distribution of a target object. The goodness-of-fit indicating correlation of the object for each one of the search models is calculated and the search models are rearranged based on the calculated goodness-of-fit. Object shapes are modelled as polygons and the luminance values are taken to be the inner boundaries of the polygons.
U.S. Pat. No. 6,762,769 to Guo et al. discloses a system and method for synthesizing textures from an input sample using an accelerated patch-based sampling system to synthesize high-quality textures in real-time based on a small input texture sample. Potential feature mismatches across patch boundaries are avoided by sampling patches according to a non-parametric estimation of the local conditional Markov Random Field (MRF) density function.
U.S. Pat. No. 6,922,489 to Lennon et al. discloses a method of interpreting an image using a statistical or probabilistic interpretation model. During the method, contextual information associated with the image is analyzed to identify predetermined features relating to the image. The statistical or probabilistic interpretation model is biased in accordance with the identified features.
U.S. Pat. No. 7,012,624 to Zhu et al. discloses a method for generating texture. During the method, a target patch to be filled in an image is determined and a sample patch is selected as a candidate for filling the target patch. A first difference between a first area surrounding the target patch and a corresponding first area surrounding the sample patch, and a second difference between a second area surrounding the target patch and a corresponding second area surrounding the sample patch are determined. The larger of the first difference and the second difference is multiplied with a first weight factor, and the smaller of the first difference and the second difference is multiplied with a second weight factor. The weighted first difference and the weighted second difference are summed to yield the distance between the target patch and the sample patch.
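By way of illustration only, the weighted combination described above can be sketched as follows. This is a minimal sketch under stated assumptions: the reference as summarized here does not specify how each area difference is measured, so a sum of squared pixel differences is assumed, and the function names and weight values are hypothetical.

```python
def area_difference(area_a, area_b):
    """Sum of squared differences between two equally sized pixel lists.

    The choice of metric is an assumption made for this sketch.
    """
    if len(area_a) != len(area_b):
        raise ValueError("areas must contain the same number of pixels")
    return sum((a - b) ** 2 for a, b in zip(area_a, area_b))


def patch_distance(first_diff, second_diff, first_weight, second_weight):
    """Weight the larger difference by first_weight, the smaller by second_weight, and sum."""
    larger = max(first_diff, second_diff)
    smaller = min(first_diff, second_diff)
    return first_weight * larger + second_weight * smaller


# Hypothetical pixel values for the two areas surrounding each patch.
first = area_difference([1, 2, 3], [1, 4, 0])
second = area_difference([5, 5], [4, 5])
distance = patch_distance(first, second, 0.75, 0.25)
```

Because the weights attach to the larger and the smaller difference rather than to a fixed area, swapping the two differences leaves the resulting distance unchanged.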
U.S. Patent Application Publication No. US2001/0012062 to Anderson discloses a system and method for analyzing and categorizing images. Analysis modules examine captured image files for selected criteria and then generate and store appropriate category tags with the images to enable desired categories of images to be automatically accessed. One analysis module analyzes the final line of image data at a red, green, blue (RGB) transition point to generate category tags. Another analysis module performs gamma correction and color space conversion to convert the image data into YCC format and then analyzes the final line of the image data at a YCC transition point to generate the category tags.
U.S. Patent Application Publication No. US2002/0131641 to Luo et al. discloses a system and method for determining image similarity. Perceptually significant features of the main subject or background of a query image are determined. The features may include color, texture and/or shape. The main subject is indicated by a continuously valued belief map. The determined perceptually significant features are then compared with perceptually significant features of images stored in a database to determine if the query image is similar to any of the stored images.
U.S. Patent Application Publication No. 2002/0171660 to Luo et al. discloses a multi-resolution block sampling based texture analysis/synthesis algorithm. A reference texture is assumed to be a sample from a probability function. The synthesis of a similar, but distinctive, synthetic texture is handled by an apparatus that first estimates and then re-samples the probability function. In order to achieve good and fast estimation of the probability function for a reference texture and in order to retain the texel structural information during the synthesis, a block sampling and texture synthesis scheme based on multi-resolution block sampling is employed. A process that integrates estimation of the dominant texture direction with the synthesis algorithm is employed to handle directional textures. The dominant direction is used to orient and then control the synthesis process so as to preserve the dominant reference image direction.
U.S. Patent Application Publication No. US2002/0183984 to Deng et al. discloses a system and method for categorizing digital images. Captured images are categorized on the basis of selected classes by subjecting each image to a series of classification tasks in a sequential progression. The classification tasks are nodes that involve algorithms for determining whether classes should be assigned to images. Contrast-based analysis and/or meta-data analysis is employed at each node to determine whether a particular class can be identified within the images.
U.S. Patent Application Publication No. US2003/0053686 to Luo et al. discloses a method for detecting subject matter regions in a color image. Each pixel in the image is assigned a belief value as belonging to a subject matter region based on color and texture. Spatially contiguous candidate subject matter regions are formed by thresholding the belief values. The spatially contiguous subject matter regions are then analyzed to determine the probability that a region belongs to the desired subject matter. A map of the detected subject matter regions and associated probabilities is generated.
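By way of illustration only, forming spatially contiguous candidate regions by thresholding belief values can be sketched as follows. This is a minimal sketch and not the method of the cited reference: it assumes the belief values are supplied as a 2D list, that pixels at or above the threshold are candidates, and that contiguity means 4-connectivity. The function name `contiguous_regions` is hypothetical.

```python
from collections import deque


def contiguous_regions(belief_map, threshold):
    """Group pixels with belief >= threshold into 4-connected regions.

    Returns a list of regions, each a list of (row, column) coordinates.
    """
    rows = len(belief_map)
    cols = len(belief_map[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    regions = []
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or belief_map[r][c] < threshold:
                continue
            # Breadth-first flood fill from this unvisited candidate pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            region = []
            while queue:
                y, x = queue.popleft()
                region.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx]
                            and belief_map[ny][nx] >= threshold):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            regions.append(region)
    return regions


# Hypothetical belief values for a 3x3 image.
belief = [[0.9, 0.8, 0.1],
          [0.2, 0.1, 0.1],
          [0.1, 0.7, 0.9]]
regions = contiguous_regions(belief, 0.5)
```

Each resulting region could then be scored separately to estimate the probability that it belongs to the desired subject matter.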
U.S. Patent Application Publication No. 2003/0174892 to Gao et al. discloses a technique for automated selection of a parameterized operator sequence to achieve a pattern classification task. A collection of labelled data patterns is input and statistical descriptions of the input labelled data patterns are then derived. Classifier performance for each of a plurality of candidate operator/parameter sequences is determined. The optimal classifier performance among the candidate classifier performances is then identified. Performance metric information, including, for example, the selected operator sequence/parameter combination, is output. The operator sequences can be chosen from a default set of operators or from a user-defined set. The operator sequences may include morphological operators such as erosion, dilation, closing, opening, close-open, and open-close.
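By way of illustration only, the morphological operators named above can be sketched for binary images as follows. This is a minimal sketch and not the technique of the cited reference: it assumes a fixed 3x3 square structuring element, treats pixels outside the image as 0, and uses hypothetical function names.

```python
def _neighbourhood(image, r, c):
    """Values in the 3x3 window centred on (r, c); outside pixels count as 0."""
    rows, cols = len(image), len(image[0])
    return [image[y][x] if 0 <= y < rows and 0 <= x < cols else 0
            for y in range(r - 1, r + 2)
            for x in range(c - 1, c + 2)]


def _apply(image, reducer):
    """Apply reducer (min or max) to every pixel's 3x3 neighbourhood."""
    return [[reducer(_neighbourhood(image, r, c)) for c in range(len(image[0]))]
            for r in range(len(image))]


def erode(image):
    return _apply(image, min)


def dilate(image):
    return _apply(image, max)


def opening(image):
    # Erosion followed by dilation: removes specks smaller than the element.
    return dilate(erode(image))


def closing(image):
    # Dilation followed by erosion: fills holes smaller than the element.
    return erode(dilate(image))


block = [[0, 0, 0, 0, 0],
         [0, 1, 1, 1, 0],
         [0, 1, 1, 1, 0],
         [0, 1, 1, 1, 0],
         [0, 0, 0, 0, 0]]
speck = [[0, 0, 0],
         [0, 1, 0],
         [0, 0, 0]]
opened_block = opening(block)
opened_speck = opening(speck)
```

Close-open and open-close sequences follow by composition, for example `opening(closing(image))`.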
U.S. Patent Application Publication No. US2004/0066966 to Schneiderman discloses a system and method for determining a set of sub-classifiers for an object detection program. A candidate coefficient-subset creation module creates a plurality of candidate subsets of coefficients. The coefficients are the result of a transform operation performed on a two-dimensional digitized image and represent corresponding visual information from the digitized image that is localized in space, frequency and orientation. A training module trains a sub-classifier for each of the plurality of candidate subsets of coefficients. A sub-classifier selection module selects certain of the sub-classifiers. The selected sub-classifiers examine each input image to determine if an object is located within a window of the image. Statistical modeling is used to take variations in object appearance into account.
U.S. Patent Application Publication No. US2004/0170318 to Crandall et al. discloses a method for detecting a color object in a digital image. Color quantization is performed on a model image including the target object and on a search image that potentially includes the target object. A plurality of search windows are generated and spatial-color joint probability functions of each model image and search image are computed. The color co-occurrence edge histogram is chosen to be the spatial-color joint probability function. The similarity of each search window to the model image is assessed to enable search windows containing the target object to be designated.
U.S. Patent Application Publication No. 2005/0047663 to Keenan et al. discloses a method that facilitates identification of features in a scene which enables enhanced detail to be displayed. One embodiment incorporates a multi-grid Gibbs-based algorithm to partition sets of end-members of an image into smaller sets upon which spatial consistency is imposed. At each site within an imaged scene, not necessarily a site entirely within one of the smaller sets, the parameters of a linear mixture model are estimated based on the smaller set of end-members in the partition associated with that site. An enhanced spectral mixing process (SMP) is then computed. One embodiment employs a simulated annealing method of partitioning hyper-spectral imagery, initialized by a supervised classification method to provide spatially smooth class labelling for terrain mapping applications. One estimate of the model is a Gibbs distribution defined over a symmetric spatial neighbourhood system that is based on an energy function characterizing spectral disparities in both Euclidean distance and spectral angle.
Although the above references disclose techniques for categorizing images, improvements are desired. It is therefore at least one object of the present invention to provide a novel method, apparatus, and computer readable medium embodying a computer program for automatically categorizing images.