A content provider can store information that will be made available to content readers. For example, a financial company might store hundreds of thousands of documents (e.g., investment reports, stock charts, and market predictions) that will be made available to customers via a Web site.
The content provider may also want to provide a content reader with information that will likely be of interest to that particular content reader. For example, one content reader may be interested in accessing documents associated with one industry while another content reader is interested in accessing documents associated with another industry.
To facilitate a content reader's ability to access information that will likely be of interest, it is known that a content provider can categorize information. For example, a content provider can associate a document with one or more “key” words. Similarly, a content provider can categorize information such that documents associated with one category (e.g., an “Automotive Industry” category) are associated with one branch of a directory structure while documents associated with another category (e.g., an “Airline Industry” category) are associated with another branch. In this way, a content reader can navigate through the directory structure and locate information that will likely be of interest.
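As a rough illustration of the conventional categorization described above, the following minimal sketch associates a few documents with key words and with branches of a directory structure. The document identifiers, key words, branch paths, and function names are all hypothetical and chosen only for illustration; they are not taken from any particular system.

```python
from collections import defaultdict

# Hypothetical documents: (identifier, key words, directory branch).
DOCUMENTS = [
    ("report-001", {"automotive", "earnings"}, "Industries/Automotive Industry"),
    ("report-002", {"airline", "fuel"}, "Industries/Airline Industry"),
    ("chart-003", {"automotive", "stock"}, "Industries/Automotive Industry"),
]


def build_keyword_index(documents):
    """Map each key word to the set of documents associated with it."""
    index = defaultdict(set)
    for doc_id, keywords, _branch in documents:
        for keyword in keywords:
            index[keyword].add(doc_id)
    return index


def build_directory(documents):
    """Map each directory branch to the documents filed under it."""
    directory = defaultdict(list)
    for doc_id, _keywords, branch in documents:
        directory[branch].append(doc_id)
    return directory


keyword_index = build_keyword_index(DOCUMENTS)
directory = build_directory(DOCUMENTS)

# A content reader interested in one industry can look up a key word
# or navigate to the corresponding branch.
print(sorted(keyword_index["automotive"]))
print(directory["Industries/Airline Industry"])
```

In this sketch every association is assigned by hand, and each document sits in exactly one branch.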
There are a number of disadvantages, however, with such an approach. For example, a content provider may not be able to review a large number of documents in order to determine how each document should be classified (e.g., when thousands of documents are generated each day). This may be particularly difficult when the documents are associated with investment research due to the large number of potential types of investments, the frequency at which this kind of information changes (e.g., daily, weekly, or occasionally), and the importance of providing such information to customers in a timely manner.
Moreover, a content provider may receive documents from a number of different content publishers (e.g., authors associated with different companies or different departments within a company), and each of these content publishers may categorize information in different ways. As a result, it can be difficult to determine how documents received from a first content publisher relate to documents received from a second content publisher.
Another problem arises when a single document is associated with a number of different categories. For example, a market report might be associated with both a “Technology” category and an “Application Software” category. In this case, a content provider or content publisher could inadvertently fail to include the document in both categories. For example, an author might indicate that his or her market report is associated with the “Application Software” category without realizing that the market report should also be associated with the “Technology” category.
In addition, it is possible that a document will be closely related to some categories while only being somewhat related to other categories. In this case, it can be difficult to provide a content reader with information that is especially likely to be of interest to that particular content reader. For example, a content reader who is only interested in receiving documents associated with a particular country could receive a large number of documents that are only somewhat related to that country.