Search is a key enabling technology for large-scale information systems, including the internet and the World Wide Web. Without search engines, such as Google Search and Microsoft's Bing, finding desired information and other content on the web would be almost impossible.
While text-based search engines have reached a high level of maturity, other forms of searching remain less well-developed. In particular, image-based searching remains comparatively limited and unsophisticated. Google image search (Google, Inc; CA, USA) attempts to identify images on the web based upon text queries provided by a user. For example, entering the words ‘white rabbit’ into Google image search returns many images of white rabbits that Google's automated crawlers have identified on the web. However, the association between text and image is inferred from text and metadata within the web page or pages in which the image has been found. Since this contextual information may not always accord with the primary feature of the accompanying image, an image search for ‘white rabbit’ may return images of objects other than white rabbits, images in which a white rabbit is not the main feature, or images of products sold under a ‘white rabbit’ brand name. Conversely, there may be images of white rabbits on the web that cannot be identified as such from the surrounding text and/or associated metadata.
A ‘reverse image search’, or ‘search-by-image’ function, allows a user to supply their own image, such as a photograph, which is then used to identify similar images held within the search engine's indexed database. Google provides such a feature as part of its image search interface, as well as in an app known as Google Goggles. Another well-known reverse image search engine is TinEye (Idee Inc; Toronto).
A search-by-image engine generally aims to identify the ‘most similar’ image within its database, according to some (typically proprietary) similarity measure. Thus, for example, if the input search image is a photograph of the Eiffel Tower taken from the opposite end of the Champ de Mars, the closest match will typically be a similar image, captured from around the same location, at a similar time of day and year, and under similar weather conditions. The associated descriptive text, drawn from the context of the web page in which the matching image was found, may well be ‘Eiffel Tower’. However, it might alternatively be ‘Paris in the spring’, ‘19th century civil engineering’, or simply ‘France’.
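By way of illustration only, the behaviour described above may be sketched as a nearest-neighbour lookup over feature vectors. The similarity measures actually used by commercial engines are proprietary and are not known here; cosine similarity is assumed purely as a stand-in, and the function names, feature values and captions in the sketch are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length feature vectors (assumed measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat an all-zero vector as matching nothing
    return dot / (norm_a * norm_b)


def closest_match(query_features, index):
    """Return the index entry whose features are most similar to the query."""
    return max(
        index,
        key=lambda entry: cosine_similarity(query_features, entry["features"]),
    )


# Hypothetical index: each entry pairs made-up image features with the
# descriptive text drawn from the web page on which the image was found.
INDEX = [
    {"features": [0.9, 0.1, 0.3], "text": "Paris in the spring"},
    {"features": [0.2, 0.8, 0.5], "text": "Eiffel Tower"},
    {"features": [0.1, 0.2, 0.9], "text": "white rabbit"},
]

# Hypothetical features of the user's photograph of the Eiffel Tower.
query = [0.88, 0.12, 0.28]

best = closest_match(query, INDEX)
print(best["text"])
```

In this sketch the visually closest entry carries the caption ‘Paris in the spring’, so the text returned with the best match does not name the object shown in the input image, even though a differently captioned entry in the same index does.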
The difficulty with this type of search-by-image feature is therefore that it does not necessarily provide the result desired by the user, which in many cases is an accurate description of an object appearing within the image. Providing such a description poses two problems that are not addressed by prior art search-by-image engines such as Google and TinEye.
The first problem arises from the need to process images in order to extract key identifying features of a specific object contained within the image. While the World Wide Web provides a rich corpus of images, many of these are not conducive to this type of analysis. They may be, for example, abstract images, artworks, landscapes, or other images that contain numerous different objects and features, none of which is particularly prominent or uniquely associated with a suitable description of the image.
The second problem is that of providing appropriate text that accurately describes an object appearing in an input image.
It is an object of the present invention to address these problems, and thus to provide an image search engine configured to provide improved object recognition and description, at least when compared with prior art search-by-image engines such as Google reverse image search and TinEye.