The Internet, as a network of connected computers, has existed for several decades, but a graphical interface to the Internet became widely adopted only in the mid-1990s. The interface uses Hypertext Markup Language (HTML) documents as a base structure and distributes these documents using the Hypertext Transfer Protocol (HTTP). This relatively intuitive interface is now known as the World Wide Web, and it has allowed many companies and individuals to provide information to a wide audience. Extensions to this architecture, such as Java and Active Server Pages, have also been made to provide web pages that are more dynamic.
This simple and powerful medium for distributing information has been adopted by many companies or entities in order to provide information, documents, multi-media presentations, and similar resources for their clients, customers, and product users. The desire to deliver a large volume of content has resulted in the creation of knowledge repositories containing thousands of documents relating to a company's products, product support and similar information. To support content organization, management, and delivery, many vendors provide portal content and document management tools to entities that need such services. These portal and document management tools typically include software to organize and format content, publish content, create user sessions, and provide a user interface.
The use of knowledge repositories allows entities to deliver content to users in a speedy and effective fashion. For example, many companies have used knowledge repositories as a support tool for technical computer issues. If a user has a technical problem with their computer, the user can access the computer vendor's website and retrieve information from the knowledge repository to aid the user in fixing the problem. Frequently, companies are able to reduce support costs by allowing users to access such support pages. Many large companies such as Hewlett-Packard, Microsoft, IBM and others have used these tools to reduce computer support costs. Because technical support consumes a large amount of a technical company's resources, the effective delivery of relevant support content is valuable. The more quickly a user can find and use relevant content, the more satisfied the customer will be.
The usefulness of information is subjective and therefore difficult to determine. Individual users may not use the same criteria to evaluate whether a document is useful or answers their specific questions. Gathering information on the usefulness of online content is thus a challenging task for an information retrieval system or knowledge repository.
Users who desire to access documents located in a knowledge repository or web page collection typically access the pages through a corporate portal or similar website. In order to find useful documents, the users can use a search engine to query the knowledge repository.
As knowledge repositories are used more extensively, more documents are added and the repositories and their document databases grow. A drawback to this growth is that users may find it more difficult to identify relevant documents that apply to their problems or needs. Search results can be diluted, especially if the user does not enter a well-focused search that returns closely related documents, because the search may also return a large number of unrelated documents. Thus, it can be difficult to identify which documents are most relevant to a problem or piece of information the user desires to find.
If a document or knowledge management system can identify documents that are more relevant to users, then the system can increase the search ranking of those documents when they are found through the search engine. To do so, the system tries to identify documents that are more relevant or related to common issues identified by users. Conversely, less relevant documents are used less frequently and receive a lower ranking. Less relevant documents should generally not be shown as higher-priority search results than useful documents, even if they match the search criteria provided by the user.
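The ranking adjustment described above can be illustrated with a minimal sketch. The names `SearchHit`, `rerank`, and `usefulness_weight`, the 0-to-1 score scales, and the linear blend are all assumptions made for illustration; the text does not specify how a match score and a usefulness estimate would be combined.

```python
from dataclasses import dataclass


@dataclass
class SearchHit:
    doc_id: str
    match_score: float  # how well the document matches the query (assumed 0..1 scale)
    usefulness: float   # usefulness estimate from prior usage (assumed 0..1 scale)


def rerank(hits, usefulness_weight=0.5):
    """Order hits by a weighted blend of query match and usefulness.

    The linear blend is a hypothetical choice, not a method from the text.
    """
    def combined(hit):
        return ((1.0 - usefulness_weight) * hit.match_score
                + usefulness_weight * hit.usefulness)

    return sorted(hits, key=combined, reverse=True)


# Hypothetical documents: the best textual match is not the most useful one.
hits = [
    SearchHit("printer-driver-faq", match_score=0.9, usefulness=0.2),
    SearchHit("printer-reset-guide", match_score=0.8, usefulness=0.9),
    SearchHit("legacy-release-notes", match_score=0.7, usefulness=0.1),
]
ranked_ids = [hit.doc_id for hit in rerank(hits)]
print(ranked_ids)
```

In this sketch the document with the strongest query match but a low usefulness estimate drops below a slightly weaker match that users have found useful, which is the behavior the paragraph describes.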
One of the methods knowledge management systems currently use to identify useful documents is tracking the number of times a document is opened, which tells the system which documents are opened most often. This approach assumes that each time a document is opened, the user actually reads or uses it. Conversely, documents that are rarely opened are considered less useful and may be reduced in priority in any search results provided to the user. One problem with this method is that a user can open a document, decide that it is not relevant, and immediately close it, yet the event is still registered in the document's hit count, thereby making the document appear more relevant than it is.
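The weakness of hit counting can be seen in a short sketch. The `HitCounter` class and the document identifiers are hypothetical; the point is only that a counter keyed on open events has no way to distinguish a document that was read from one that was closed immediately.

```python
from collections import Counter


class HitCounter:
    """Hypothetical open-count tracker: every open event counts as one hit."""

    def __init__(self):
        self.hits = Counter()

    def record_open(self, doc_id):
        # The counter sees only that the document was opened,
        # not whether the user went on to read or use it.
        self.hits[doc_id] += 1

    def ranking(self):
        # Documents ordered from most opened to least opened.
        return [doc_id for doc_id, _ in self.hits.most_common()]


counter = HitCounter()

# "doc-a" is opened three times and closed immediately each time as irrelevant.
for _ in range(3):
    counter.record_open("doc-a")

# "doc-b" is opened twice and actually read both times.
for _ in range(2):
    counter.record_open("doc-b")

hit_ranking = counter.ranking()
print(hit_ranking)
```

Here "doc-a" ranks above "doc-b" even though no user found it relevant, because the hit count records the open event and nothing else.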
Another, more direct way to capture the usefulness of a document is to ask users to provide feedback after reading it. However, users are often reluctant to provide feedback, typically because they do not feel they have time to comment on specific documents. In addition, direct feedback is unreliable because the system cannot determine the competency of the individuals giving feedback and cannot control the size of the population sample.