The present invention relates to a method and apparatus for representing a colour image or a region of an image for searching purposes, and a method and apparatus for searching for colour images or image regions.
Searching techniques based on image content for retrieving still images and video from, for example, multimedia databases are known. Various image features, including colour, texture, edge information, shape and motion, have been used for such techniques. Applications of such techniques include Internet search engines, interactive TV, telemedicine and teleshopping.
For the purposes of retrieval of images from an image database, images or regions of images are represented by descriptors, including descriptors based on colours within the image. Various different types of colour-based descriptors are known, including the average colour of an image region, statistical moments based on colour variation within an image region, a representative colour, such as the colour that covers the largest area of an image region, and colour histograms, where a histogram is derived for an image region by counting the number of pixels in the region of each of a set of predetermined colours.
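By way of illustration only, three of the descriptor types mentioned above (the average colour, a representative colour covering the largest area, and a colour histogram) may be sketched as follows. The function names, the representation of a region as a list of (r, g, b) tuples with 8-bit channels, and the uniform quantisation into `levels` steps per channel as the "set of predetermined colours" are all assumptions made for this sketch and are not taken from any of the cited documents.

```python
from collections import Counter

def quantise(pixel, levels=4):
    """Map an (r, g, b) pixel (0-255 per channel) to one of levels**3
    predetermined colours. Uniform quantisation is an assumption."""
    step = 256 // levels
    r, g, b = pixel
    return (r // step, g // step, b // step)

def colour_histogram(pixels, levels=4):
    """Count the pixels in the region falling into each predetermined colour."""
    return Counter(quantise(p, levels) for p in pixels)

def average_colour(pixels):
    """Per-channel mean of all pixels in the region."""
    n = len(pixels)
    return tuple(sum(p[c] for p in pixels) / n for c in range(3))

def dominant_colour(pixels, levels=4):
    """Predetermined colour that covers the largest area of the region."""
    return colour_histogram(pixels, levels).most_common(1)[0][0]

# Hypothetical four-pixel region: three reddish pixels and one blue pixel.
region = [(255, 0, 0), (250, 10, 5), (0, 0, 255), (240, 5, 5)]
histogram = colour_histogram(region)
```

In this sketch the histogram retains one count per occupied colour bin, whereas the average and dominant colours each reduce the region to a single value, which illustrates the difference in storage between the descriptor types.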
A known content-based image retrieval system is QBIC (query by image content) (see U.S. Pat. No. 5,579,471, MPEG document M4582/P165: Colour Descriptors for MPEG-7 by IBM Almaden Research Center). In one of the modes of operation of that system, each image in a database is divided into blocks. The pixels of each block are grouped into subsets of similar colours and the largest such subset is selected. The average colour of the selected subset is chosen as the representative colour of the respective block. The representative colour information for the image is stored in the database. A query of the database can be made by selecting a query image. Representative colour information for the query image is derived in the same manner as described above. The query information is then compared with the information for the images stored in the database using an algorithm to locate the closest matches.
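The block-based mode of operation described above may be sketched, purely for illustration, as follows. This is not the QBIC implementation: the square block size, the grouping of similar colours by uniform quantisation, and the comparison by summed Euclidean distance between corresponding block colours are assumptions made for this sketch, as are all function names.

```python
import math
from collections import defaultdict

def split_into_blocks(image, block_size):
    """Yield the pixels of each block of a 2-D list of (r, g, b) pixels."""
    height, width = len(image), len(image[0])
    for top in range(0, height, block_size):
        for left in range(0, width, block_size):
            yield [image[y][x]
                   for y in range(top, min(top + block_size, height))
                   for x in range(left, min(left + block_size, width))]

def representative_colour(block, levels=4):
    """Average colour of the largest subset of similar colours in a block.
    Similarity is approximated by uniform quantisation (an assumption)."""
    step = 256 // levels
    subsets = defaultdict(list)
    for r, g, b in block:
        subsets[(r // step, g // step, b // step)].append((r, g, b))
    largest = max(subsets.values(), key=len)
    n = len(largest)
    return tuple(sum(p[c] for p in largest) / n for c in range(3))

def describe(image, block_size):
    """One representative colour per block, in raster order."""
    return [representative_colour(b) for b in split_into_blocks(image, block_size)]

def distance(desc_a, desc_b):
    """Summed Euclidean distance between corresponding block colours."""
    return sum(math.dist(a, b) for a, b in zip(desc_a, desc_b))

def closest_matches(query_desc, database, k=1):
    """Names of the k stored images whose descriptors are nearest the query."""
    return sorted(database, key=lambda name: distance(query_desc, database[name]))[:k]

# Hypothetical 2x2 images, each treated as a single block.
RED, BLUE = (255, 0, 0), (0, 0, 255)
database = {
    "reddish": describe([[RED, RED], [RED, RED]], 2),
    "bluish": describe([[BLUE, BLUE], [BLUE, RED]], 2),
}
query = describe([[RED, RED], [RED, BLUE]], 2)
matches = closest_matches(query, database)
```

Here the single blue pixel of the query image has no effect on its descriptor, because only the largest subset of similar colours in each block contributes to the representative colour.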
MPEG document M4582/P437 and U.S. Pat. No. 5,586,197 disclose a similar approach, but using a more flexible method of dividing an image into blocks and a different method of comparing images. In another variation, described in MPEG document M4582/P576: Colour representation for visual objects, a single value for each of two representative colours per region is used.
Several techniques for representing images based on colour histograms have been developed, such as that described in MPEG document M4582/P76: A colour descriptor for MPEG-7: Variable-Bin colour histogram. Other techniques use statistical descriptions of the colour distribution in an image region. For example, MPEG document M4582/P549: Colour Descriptor by using picture information measure of subregions in video sequences discloses a technique whereby an image is divided into high and low entropy regions and colour distribution features are calculated for each type of region. MPEG document M4582/P319: MPEG-7 Colour Descriptor Proposal describes using a mean and a covariance value as descriptors for an image region.
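A statistical descriptor of the mean-and-covariance kind may be sketched, for illustration only, as follows. The exact form of the covariance value used in the cited proposal is not stated here, so the full 3x3 population covariance matrix over the (r, g, b) channels is an assumption of this sketch, as is the function name.

```python
def mean_and_covariance(pixels):
    """Return the per-channel mean and the 3x3 population covariance matrix
    of a region given as a list of (r, g, b) tuples."""
    n = len(pixels)
    mean = [sum(p[c] for p in pixels) / n for c in range(3)]
    cov = [[sum((p[i] - mean[i]) * (p[j] - mean[j]) for p in pixels) / n
            for j in range(3)]
           for i in range(3)]
    return mean, cov

# Hypothetical two-pixel region in which red and blue vary in opposite directions.
region = [(10, 20, 30), (30, 20, 10)]
mean, cov = mean_and_covariance(region)
```

Such a descriptor has a fixed size regardless of the number of pixels in the region, but it summarises the region as a single distribution around one mean colour.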
All the approaches described above have important shortcomings. Some of them, in particular colour histogram techniques, are highly accurate, but require relatively large amounts of storage and processing time. Other methods, such as the ones using one or two representative colours, have high storage and computational efficiency but are not precise enough. The statistical descriptors are a compromise between those two types of techniques, but they can suffer from a lack of flexibility, especially in cases where the colours of pixels vary widely within a region.
The present invention provides a method of representing an image by approximating the colour distribution using a number of component distributions, each corresponding to a representative colour in an image region, to derive descriptors of the image region.
The invention also provides a method of searching for images using such descriptors.
The invention also provides a computer program for implementing said methods and a computer-readable medium storing such a computer program. The computer-readable medium may be a separable medium such as a floppy disc or CD-ROM or memory such as RAM.