Information is contained in objects or resources, which are available for access in a database system. The objects or resources may be in the form of documents, web sites, images, video, audio or other forms of media and will have structured or non-structured descriptions associated with them. Database systems from which the objects may be retrieved may be computer networks with data stored across the network on a plurality of connected computer servers. One example of such a network is the Internet.
The Internet has resulted in new challenges for information retrieval due to the extremely high volume of information contained on web sites and web pages forming the World Wide Web. Users can locate an object such as a web site by using its URL (Uniform Resource Locator), which is the address of the object and which specifies the protocol to be used in accessing it. If a user does not know a URL, a search must be carried out to locate the object or objects of interest to the user. Searches are usually carried out by using keywords derived from authored descriptions of the objects or resources.
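By way of illustration only, the following minimal Python sketch shows how a URL carries both the protocol and the address of an object. The URL shown and the helper name `describe_url` are hypothetical and are not taken from any particular system.

```python
from urllib.parse import urlsplit


def describe_url(url):
    """Split a URL into the protocol, the host and the path of the object."""
    parts = urlsplit(url)
    return {
        "protocol": parts.scheme,  # e.g. "http"
        "host": parts.netloc,      # server on which the object resides
        "path": parts.path,        # location of the object on that server
    }


# Hypothetical example address, used purely for illustration.
info = describe_url("http://www.example.com/catalogue/index.html")
```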
Search engines are programs that search for keywords and phrases in the objects themselves or in descriptions of objects. On the Internet, a search engine program may search a single web site or may search across many sites using agents such as spiders to gather lists of available files and documents and store these lists in databases that users can search by keyword. Most search engines reside on a server.
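The keyword lookup described above may be sketched, under simplifying assumptions, as an inverted index that maps each word to the objects containing it. The sample documents, identifiers and function names below are hypothetical; a real search engine would gather the text with agents such as spiders rather than from an in-memory dictionary.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lower-case the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(documents):
    """Map each word to the set of document identifiers that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in tokenize(text):
            index[word].add(doc_id)
    return index


def search(index, query):
    """Return the documents that contain every word of the query."""
    words = tokenize(query)
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results


# Hypothetical stand-in for lists of files gathered from several sites.
documents = {
    "site-a/page1": "Vintage car restoration guide",
    "site-a/page2": "Car insurance price comparison",
    "site-b/home": "Guide to restoring antique furniture",
}
index = build_index(documents)
hits = search(index, "car guide")
```

In this sketch a query matches only when every query word appears literally in the stored text, which is the behaviour that keyword-based searching relies upon.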
Objects are categorised or formatted at the time of storage in order for the information in the object to be available for search and retrieval. Many search engines and processes rely on authored information to classify objects. The authored information may be provided by the creator of the object or by a dedicated classifier.
A user wishing to access these objects submits a search term to a search engine, which then returns pointers to the objects in the form of links that allow the objects to be viewed or retrieved.
Automated search engines that rely on keyword matching often return too many low-quality matches. The success of a search depends on the user entering search terms that closely match those already associated with the object and known to the search engine. If the user does not enter the correct terms, the objects are not identified.
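The dependence on matching terms can be illustrated with a minimal Python sketch. The catalogue, the keywords and the function name `keyword_search` are hypothetical and are chosen only to show the effect; they do not represent any particular search engine.

```python
# Hypothetical authored keywords assigned to objects at the time of storage.
authored_keywords = {
    "doc-1": {"automobile", "maintenance"},
    "doc-2": {"gardening", "perennials"},
}


def keyword_search(query_terms, catalogue):
    """Return objects whose authored keywords include every query term."""
    wanted = {term.lower() for term in query_terms}
    return sorted(
        obj for obj, keywords in catalogue.items() if wanted <= keywords
    )


# The classifier's own term finds the object.
found = keyword_search(["automobile"], authored_keywords)

# A term with the same meaning, but absent from the authored keywords,
# finds nothing even though doc-1 would be relevant to the user.
missed = keyword_search(["car"], authored_keywords)
```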
Users of search engines may come from different disciplines, such as advertising, technical or marketing disciplines. Users may even be accessing objects for personal use, such as objects relating to hobbies. The vocabularies that users understand may differ from those used in classifying the objects. As a result, the information contained within the objects may be of interest and relevant to the user, yet the user does not achieve successful search hits.
Often information is accessed only by serendipity. An object may be discovered by luck whilst browsing, or a user may have a different understanding of an object's classification.