The World Wide Web (WWW) is comprised of an expansive network of interconnected computers upon which businesses, governments, groups, and individuals throughout the world maintain inter-linked computer files known as web pages. Users navigate these pages by means of computer software programs commonly known as Internet browsers. The vastness of the unstructured WWW causes users to rely primarily on Internet search engines to retrieve information or to locate businesses. These search engines use various means to determine the relevance of a user-defined search to the information retrieved.
The authors of web pages provide information, known as metadata, within the body of the hypertext markup language (HTML) document that defines the web pages. A computer software product known as a web crawler systematically accesses web pages by sequentially following hypertext links from page to page. The crawler indexes the pages for use by the search engines using information about a web page as provided by its address or Uniform Resource Locator (URL), metadata, and other criteria found within the page. The crawler is run periodically to update previously stored data and to append information about newly created web pages. The information compiled by the crawler is stored in a metadata repository or database. The search engines search this repository to identify matches for the user-defined search rather than attempt to find matches in real time.
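The crawl-and-index cycle described above can be sketched as follows. This is a minimal illustration only: the in-memory `PAGES` table, the example URLs, and the `crawl` function are hypothetical stand-ins for real HTTP fetching and HTML parsing, and the breadth-first visiting order is an assumption, since the text says only that links are followed from page to page.

```python
from collections import deque

# Hypothetical in-memory "web": URL -> (metadata keywords, outgoing links).
PAGES = {
    "http://a.example/": (["shoes", "retail"],
                          ["http://b.example/", "http://c.example/"]),
    "http://b.example/": (["boots"], ["http://a.example/"]),
    "http://c.example/": (["sandals", "retail"], []),
}


def crawl(seed, pages):
    """Follow links breadth-first from `seed` and build a metadata repository."""
    repository = {}
    queue = deque([seed])
    while queue:
        url = queue.popleft()
        # Skip pages already indexed and links that lead nowhere.
        if url in repository or url not in pages:
            continue
        keywords, links = pages[url]
        repository[url] = set(keywords)  # store the page's metadata
        queue.extend(links)              # schedule linked pages for a visit
    return repository


repository = crawl("http://a.example/", PAGES)
```

Rerunning `crawl` periodically and merging the result into the stored repository would correspond to the update step mentioned above.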
Typically, each search result rendered by the search engine includes a list of individual entries that have been identified by the search engine as satisfying the user's search expression. Each entry or “hit” includes a hyperlink that points to a Uniform Resource Locator (URL) location or web page. In addition to the hyperlink, certain search result pages include a short summary or abstract that describes the content of the web page.
A common technique for accessing textual materials on the Internet is by means of a “keyword” combination, generally with Boolean operators between the words or terms, where the user enters a query comprised of an alphanumeric search expression or keywords. In response to the query, the search engine sifts through available web sites to match the words of the search query to words in a metadata repository, in order to locate the requested information.
This word match based search engine parses the metadata repository to locate a match by comparing the words of the query to indexed words of documents in the repository. If there is a word match between the query and words of one or more documents, the search engine identifies those documents and returns the search results in the form of HTML pages.
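As a rough illustration of word-match searching, the sketch below builds an inverted index over a small repository and answers a query whose terms are implicitly joined by the Boolean AND operator. The names `build_index` and `search_all`, and the sample documents, are hypothetical; a production engine would also handle OR, NOT, ranking, and the rendering of results as HTML pages.

```python
def build_index(repository):
    """Map each indexed word to the set of documents that contain it."""
    index = {}
    for doc_id, words in repository.items():
        for word in words:
            index.setdefault(word.lower(), set()).add(doc_id)
    return index


def search_all(index, query):
    """Return the documents that contain every word of the query (Boolean AND)."""
    terms = query.lower().split()
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


# Hypothetical repository contents: document id -> indexed words.
docs = {
    "page1": ["red", "shoes"],
    "page2": ["red", "boots"],
    "page3": ["blue", "shoes"],
}
index = build_index(docs)
hits = search_all(index, "red shoes")  # only page1 contains both words
```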
Furthermore, not only is the quantity of WWW material increasing, but the types of digitized material are also increasing. For example, it is possible to store alphanumeric texts, data, audio recordings, pictures, photographs, drawings, images, video and prints as various types of digitized data. However, such large quantities of materials are of little value unless the desired information is readily queryable, browseable and retrievable in an acceptably short period of time. While certain techniques have been developed for accessing specific types of textual materials, these techniques are at best moderately adequate for accessing graphic, audio or other specialized materials. Consequently, there are large bodies of published materials that still remain inaccessible and thus unusable or significantly underutilized.
As a result, with the proliferation of the Internet, it is becoming increasingly important to enable users to search the World Wide Web (WWW) by content and context, and not be limited to textual searches. More specifically, given a sample object, the problem of finding similar objects from a large collection of objects is called content-based object querying and retrieval. However, similarity among objects is subjective, and in the case of images, visual similarity comprises matching color, shape, and texture features.
Traditional methods for solving the above problem typically transform each image into one or more “signatures” pertaining to the color, shape, and texture of the image. Each image is effectively mapped to one or more d-dimensional points representing the features of the image, and these points are stored in an index for fast search and retrieval. Given a query image, the same transformation is applied to it, extracting its feature vectors, after which the index is queried for all points (i.e., images) that are within a certain distance from the query feature vectors. The distance measure used is typically the Euclidean distance between the points in the d-dimensional space, which is difficult to interpret intuitively and may not be meaningful to the user.
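The signature-based approach can be sketched as follows, assuming each image has already been reduced to a single d-dimensional feature vector. The feature values, image names, and the `range_query` helper are hypothetical, and a linear scan stands in for the multidimensional index that a real system would use.

```python
import math


def euclidean(p, q):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def range_query(index, query_vec, radius):
    """Return (image, distance) pairs within `radius` of the query, nearest first."""
    scored = ((name, euclidean(vec, query_vec)) for name, vec in index.items())
    return sorted((pair for pair in scored if pair[1] <= radius),
                  key=lambda pair: pair[1])


# Hypothetical 3-dimensional signatures (e.g. color, shape, texture scores).
features = {
    "img_a": (0.0, 0.0, 0.0),
    "img_b": (3.0, 4.0, 0.0),
    "img_c": (10.0, 0.0, 0.0),
}
results = range_query(features, (0.0, 0.0, 0.0), 5.0)
```

The returned distances are plain numbers in feature space, which illustrates the interpretability concern: a distance of 5.0 says nothing directly about how alike two images look.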
Another traditional method uses the Euclidean distance to match regions, and then approximates the total matched area between two images. The final similarity measure for two images is computed as the fraction of matched versus total image area in the two images. The first type of similarity measure (i.e., the Euclidean distance) has no visual meaning and is difficult for the user to interpret. The second approach, which uses matched area as the similarity measure, is more intuitive but is difficult to compute.
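One possible reading of the matched-area measure is sketched below. It assumes each image is given as a list of regions, each with a feature vector and an area, and that two regions match when their Euclidean distance does not exceed a threshold `eps`. The matching rule, the threshold, and the sample values are assumptions made for illustration, not the specific method of any particular system.

```python
import math


def area_similarity(regions_a, regions_b, eps):
    """Fraction of the combined area of two images covered by matched regions.

    Each region is a (feature_vector, area) pair. A region counts as matched
    if some region of the other image lies within distance `eps` of it.
    """
    matched_a, matched_b = set(), set()
    for i, (feat_a, _) in enumerate(regions_a):
        for j, (feat_b, _) in enumerate(regions_b):
            if math.dist(feat_a, feat_b) <= eps:
                matched_a.add(i)
                matched_b.add(j)
    matched = (sum(regions_a[i][1] for i in matched_a)
               + sum(regions_b[j][1] for j in matched_b))
    total = sum(a for _, a in regions_a) + sum(a for _, a in regions_b)
    return matched / total if total else 0.0


# Hypothetical regions: (2-dimensional feature vector, area in pixels).
image_a = [((0.0, 0.0), 60.0), ((5.0, 5.0), 40.0)]
image_b = [((0.1, 0.0), 50.0), ((9.0, 9.0), 50.0)]
score = area_similarity(image_a, image_b, eps=0.5)  # 110 matched of 200 total
```

A score of 0.55 reads directly as "55% of the combined image area matched", which is the intuitive appeal of this measure. The nested loop over all region pairs hints at the computational cost.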
As an illustration, in the context of Internet shopping (or e-commerce) applications, if a user is shopping for a particular item and the search result provides a list of browseable images (or digital pictures), each of these images may take, for example, 10 seconds to download. As a result, it could take a shopper about 10 minutes to browse 60 such images. By contrast, when shopping in a retail store, the shopper is capable of visually scanning and comparing 60 substantially similar items in a fraction of the browsing time.
Such a delay in browsing Internet images could undermine the convenience of online shopping, and may lead to lost opportunities. There is therefore a still unsatisfied need for a system, method, and computer program product for improving conventional applications of content-based object querying by improving the image similarity search performance of search engines.