1. Field of Disclosure
The present disclosure generally relates to a media search and retrieval system and, more particularly, to systems and methods for rapid retrieval of searched media files that use a first database containing suggested search terms and associated pointers to media files in a second database to autocomplete user requests.
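By way of illustration only, the two-database arrangement described above may be sketched as follows. The sketch assumes in-memory dictionaries in place of actual databases; the names MEDIA_FILES, SUGGESTIONS, and autocomplete, as well as the sample records, are hypothetical and are not taken from the disclosure.

```python
from bisect import bisect_left

# Hypothetical "second database": media file records keyed by an ID.
MEDIA_FILES = {
    101: "cat_on_sofa.jpg",
    102: "cat_in_garden.jpg",
    201: "car_red.jpg",
    301: "castle_night.mp4",
}

# Hypothetical "first database": suggested search terms, each stored with
# pointers (here, IDs) to media files in the second database.
SUGGESTIONS = {
    "car": [201],
    "castle": [301],
    "cat": [101, 102],
}

# Sorted term list so that all terms sharing a prefix are contiguous.
SORTED_TERMS = sorted(SUGGESTIONS)


def autocomplete(prefix, limit=10):
    """Return (term, media files) pairs for suggested terms starting with prefix."""
    prefix = prefix.lower()
    # Binary search for the first term that is >= prefix.
    start = bisect_left(SORTED_TERMS, prefix)
    results = []
    for term in SORTED_TERMS[start:]:
        if not term.startswith(prefix) or len(results) >= limit:
            break
        # Follow the stored pointers into the second database.
        files = [MEDIA_FILES[media_id] for media_id in SUGGESTIONS[term]]
        results.append((term, files))
    return results
```

In this sketch, a partial request such as autocomplete("ca") yields each matching suggested term together with the media files its pointers reference, without running a separate search over the media collection.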
2. Brief Description of Related Art
Presently, many multimedia databases are available on the Internet, and users around the world search them for multimedia files. Many challenges exist in the field of media searching. The first challenge is the difficulty of locating a media file in a large and varied collection of multimedia files. The second challenge is the speed of locating a specific multimedia file in a large database of multimedia files. A number of retrieval systems have been established, but they are unable to meet these challenges. Typically, these systems include mechanisms that perform a search by designating a file name (e.g., cat.jpg) or an image number (e.g., cat001.jpg). These systems perform either a keyword search (a search using a keyword assigned to each image in advance) or a full text search (a search using an arbitrary term included in the content of the media files). These text-query-based search systems also require an operation of designating a scope of a search, and often incur an enormous amount of processing overhead.
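By way of illustration only, the distinction between a keyword search and a full text search may be sketched as follows. The record layout, the sample records, and the function names keyword_search and full_text_search are hypothetical. Both functions examine every record, which is one simple way in which the processing overhead of such systems can grow with the size of the database.

```python
# Hypothetical records: each media file carries keywords assigned in advance
# and some text content associated with the file.
RECORDS = [
    {
        "file": "cat001.jpg",
        "keywords": {"cat", "pet"},
        "text": "A tabby kitten asleep on a windowsill",
    },
    {
        "file": "dog001.jpg",
        "keywords": {"dog", "pet"},
        "text": "A puppy chasing a cat across a lawn",
    },
]


def keyword_search(term):
    """Match the term against keywords assigned to each file in advance."""
    term = term.lower()
    return [r["file"] for r in RECORDS if term in r["keywords"]]


def full_text_search(term):
    """Match the term against any word in the text content of each file."""
    term = term.lower()
    return [r["file"] for r in RECORDS if term in r["text"].lower().split()]
```

In this sketch, keyword_search("cat") and full_text_search("cat") return different files, because the first consults only the pre-assigned keywords while the second consults the text content.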
To enhance performance, classical image retrieval systems have focused on feature extraction and selection, data representation, and similarity measures. In recent years, some commercial products and experimental prototype systems have been successfully developed, including but not limited to QBIC, Photobook, Virage, VisualSEEk, Netra, and SIMPLIcity. In the aforementioned systems, the time required for media file retrieval is primarily dependent upon database size. Thus, these systems are not suitable for large, multimedia-based commercial applications, and using them to search large media file databases may be cost prohibitive. For example, keyword-based media retrieval systems may find correspondences by matching keywords from a user input to the keywords that have been manually associated with the images in the database. However, in these systems, searching for and finding media files whose associated keywords are missing or inaccurate can be extremely difficult. Often, “relevance feedback” techniques, which utilize user feedback to understand the relevance of selected exemplary media files, are employed to search such media files and to reduce, inter alia, searching time.
Keyword-based image retrieval systems generally find correspondences by matching keywords from a user input to the keywords that have been manually attached to the images in the database. However, some images may not have appropriate keywords to describe them, and the image search can therefore be seriously affected. One solution is to apply “relevance feedback” techniques that utilize user feedback to gain an understanding of the relevance of selected exemplary images and hence reduce possible errors or redundancy. For example, U.S. Pat. No. 7,181,678 (2007) teaches a method of using a Bayesian classifier technique to determine the distribution of the query space for positive hits, using feedback information to update each iteration in order to improve the accuracy of the search results. The major drawback of this method is that it sacrifices searching speed, as the level of computation increases with each iteration. Eigenvalue and spectral clustering methods, such as those taught by U.S. Pat. No. 6,763,137 (2004), provide rapid image searching using eigenvalues and clustering or grouping of objects for recognition purposes. Although the eigenvalue systems run relatively fast, they may compromise media retrieval accuracy. Graph-based clustering methods, such as those taught by U.S. Pat. No. 7,113,944 (2006), store images in a hybrid matrix, in which each vector represents an image and which is in turn clustered by a content-based clustering algorithm. For each image in the matrix, a log-based document is constructed and stored in the hybrid matrix. Although this methodology has better media file retrieval accuracy, it may have an adverse impact on the speed and efficiency of media searches.
Media search user interfaces typically include an input box, a search button (which can also be a “submit” or “go” button), and a display area. Searchers enter a search term in the input box and click on the search button before search results are displayed in the display area of the user interface. Frequently, while searchers are entering a search term, search engines may present a drop-down list of prospective search terms to help searchers define a search term. Searchers often select a search term from the presented list, click on the search button, and review the results produced by the selected search term. During the search process, unless the selected search term produces the intended result, searchers typically move on to select a different search term. Generally, a progressive search term selection process starts at a coarse phase, when a searcher enters a partial search term, and finally leads to a refinement phase, where the searcher is satisfied with the result produced by the selected search term. Typically, as searchers experiment with search terms, they may have to select a search term, click on a “submit” or “go” button, and wait to see the results generated by the selected search term. In other words, even though searchers are able to view a drop-down list of possible search terms while entering a search term, they are unable to view the search results produced by a prospective search term before clicking on the “submit” or “go” button.
Currently available methods allow a user to search and retrieve media files. However, the conventional methods do not provide high processing speed, optimum use of storage space, and a cost-efficient structure that supports rapid searching and accurate retrieval of media files. Accordingly, there is a need for improved systems and methods supporting rapid search and accurate retrieval of media files.