The automatic classification of images has become increasingly important as the number of images provided by web pages increases. The classification of images has many different applications. For example, a search engine service that provides image searching may attempt to classify images to make searching both more efficient and more effective. The search engine service may classify images into a hierarchy of image classifications (e.g., geography, North America, United States, and so on). The image search engine service may allow a user to specify both a search request (or query) and classifications of the images of interest (e.g., a query of “sunset” and a classification of “North America”). The image search engine service can then limit its searching to images within those specified classifications. Another example where classification of images may be helpful is a web marketplace. A web marketplace system may allow many different retailers to advertise and sell their products. The retailers may provide a database of their products, which may include, for each product, pricing information, a description of the product, and an image of the product. Different retailers may describe the products in different ways, which makes it difficult for the marketplace system to properly classify the products that are available for sale. If the marketplace system were able to effectively identify a classification by analyzing the image of the product, the marketplace system could use that classification to help classify the product.
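The classification-restricted search described above for the image search engine service can be illustrated with a minimal sketch. The catalog, its field names (`id`, `keywords`, `path`), and the function `search_images` are hypothetical and chosen only for illustration; an actual search engine service would likely use an index rather than a linear scan.

```python
# Hypothetical catalog: each image carries keywords and a path through the
# classification hierarchy (e.g., geography > North America > United States).
IMAGES = [
    {"id": "img1", "keywords": {"sunset", "beach"},
     "path": ("geography", "North America", "United States")},
    {"id": "img2", "keywords": {"sunset", "desert"},
     "path": ("geography", "Africa", "Namibia")},
    {"id": "img3", "keywords": {"skyline"},
     "path": ("geography", "North America", "Canada")},
]


def search_images(images, query, classification):
    """Return ids of images within `classification` whose keywords contain `query`."""
    # Restrict the candidate set to the requested classification first, so the
    # query is matched against fewer images.
    candidates = [img for img in images if classification in img["path"]]
    return [img["id"] for img in candidates if query in img["keywords"]]


# A query of "sunset" limited to the classification "North America".
results = search_images(IMAGES, "sunset", "North America")
print(results)
```

In this sketch, an image classified under “United States” also matches a request for “North America” because its path passes through that classification, which is one simple way to model a hierarchy.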
Many different techniques have been applied to classifying images. Some techniques classify images based on text that is near the image. For example, a web page may include a title of the image and descriptive text. The accuracy of such techniques depends not only on the ability to accurately identify the title and associated descriptive text but also on how accurately the title and descriptive text represent the image. Because of the wide variety of web page formats, it can be difficult to identify text relating to an image. Also, the text relating to an image may give very little information to help with classification. Moreover, such techniques are not particularly useful for a marketplace system when the various retailers use incomplete, ambiguous, or incorrect descriptions. Other techniques classify images based on the content of the image itself. Such techniques are referred to as content-based image retrieval (“CBIR”) systems. CBIR systems attempt to classify images based on characteristics such as color, shape, and texture. Unfortunately, the precision of CBIR systems has been unsatisfactory because it is difficult to identify a classification from the low-level characteristics of an image.
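The difficulty with low-level characteristics can be illustrated with a minimal sketch of one such characteristic, a coarse color histogram compared by histogram intersection. This is a generic illustration and not the method of any particular CBIR system; the function names, the bin count, and the toy pixel data are assumptions made for the example.

```python
from collections import Counter


def color_histogram(pixels, bins_per_channel=4):
    """Return a normalized coarse RGB histogram for (r, g, b) pixels in 0-255.

    Assumes bins_per_channel divides 256 evenly.
    """
    step = 256 // bins_per_channel
    counts = Counter((r // step, g // step, b // step) for r, g, b in pixels)
    total = sum(counts.values())
    return {bin_key: n / total for bin_key, n in counts.items()}


def histogram_intersection(h1, h2):
    """Similarity in [0, 1]: the sum over bins of the smaller of the two values."""
    return sum(min(value, h2.get(bin_key, 0.0)) for bin_key, value in h1.items())


# Hypothetical toy "images" given as flat lists of pixels.
sunset = [(250, 120, 30)] * 60 + [(40, 20, 60)] * 40        # orange sky, dark ground
orange_fruit = [(245, 125, 20)] * 60 + [(30, 30, 50)] * 40  # orange fruit, dark table
forest = [(20, 120, 30)] * 100                               # uniformly green

sim_sunset_fruit = histogram_intersection(
    color_histogram(sunset), color_histogram(orange_fruit))
sim_sunset_forest = histogram_intersection(
    color_histogram(sunset), color_histogram(forest))
print(sim_sunset_fruit, sim_sunset_forest)
```

In this toy data the sunset and the fruit fall into the same color bins and so appear nearly identical to the histogram, even though they belong to unrelated classifications, while the forest shares no bins with the sunset. Color alone separates some images but does not by itself identify what an image depicts.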