1. Field of the Invention
The present invention relates to knowledge retrieval, management and processing on the world wide web and intranets. In particular, the present invention relates to personalizing, organizing and managing information on the world wide web and intranets.
2. Discussion of the Related Art
Users of the world wide web ("web") suffer information overload. The web has no aggregate structure for organizing information into distinct web localities, nor does a user have a global view of the entire Web from which to effectively retrieve relevant pages. In fact, a recent survey of 11,700 web users indicates that 30.31% of the surveyed users report encountering problems in "finding known information." In the same survey, 27.80% and 12.16% of the surveyed users report, as significant problems, organizing collected information and finding pages already visited, respectively.
Another study focused on bookmark usage indicates that most users gradually build a small-sized archive. 68% of the surveyed users have 11 to 100 bookmarks and over 93% of the surveyed users create 0 to 5 bookmarks in each browsing session. The study also found that a larger archive requires a more sophisticated organization, such as automatically classifying bookmarks according to the contents of the documents they mark. An empirical study on users' patterns of revisiting web pages found that 58% of the web pages a typical individual accesses are revisits.
These studies suggest a need for a tool that allows a user to build and organize a larger collection of bookmarks than he or she can reasonably maintain manually now.
The present invention provides a bookmark system having access to a computer network. Such a bookmark system includes (a) an interface to the computer network; (b) a database management system; and (c) a bookmark management system coupled to the database and the interface. In the bookmark system, the bookmark management system creates and maintains in the database document records ("bookmarks") containing information for locating documents in the computer network, and retrieves documents, when needed, from the computer network over the interface.
According to one aspect of the invention, the bookmark system includes a document classification system for associating documents of the bookmark system into one or more categories. The classification system may access a classifier program on the computer network through the interface. The bookmark system accesses the computer network through a proxy server. In one embodiment, the database system accesses a lexical dictionary for retrieving a list of keywords that relate to a document. The proxy server can be used to monitor an access pattern for a document and to record the identity of the user accessing the document.
According to another aspect of the present invention, the bookmark system classifies a document into one of many categories, each category being a leaf node of a hierarchical classification or navigation tree. In one embodiment, each category preferably includes less than a predetermined number of documents. When the number of documents in an existing node exceeds the predetermined number of documents, the existing node is split into child nodes. Conversely, the child nodes of a parent node in the navigation tree are merged when the documents in the child nodes sum to less than the predetermined number.
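By way of illustration only, the splitting and merging of nodes in the navigation tree can be sketched as follows. This is a minimal sketch and not the disclosed implementation: the names `Node`, `add_document`, `split`, `merge_if_small` and the threshold `MAX_DOCS` are hypothetical, and the `subcategory` label is assumed to be supplied by a classifier that is not shown.

```python
from dataclasses import dataclass, field

# Hypothetical threshold; the text only says "a predetermined number".
MAX_DOCS = 4


@dataclass
class Node:
    name: str
    docs: list = field(default_factory=list)      # (url, subcategory) pairs held by a leaf
    children: dict = field(default_factory=dict)  # child name -> Node

    def is_leaf(self):
        return not self.children

    def count(self):
        """Total number of documents in this node and all of its descendants."""
        return len(self.docs) + sum(c.count() for c in self.children.values())


def split(node):
    """Move a leaf's documents into child nodes keyed by subcategory (one level only)."""
    for url, sub in node.docs:
        child = node.children.setdefault(sub, Node(sub))
        child.docs.append((url, sub))
    node.docs = []


def add_document(node, url, subcategory):
    """Add a document; split a leaf once it holds more than MAX_DOCS documents."""
    if node.children:
        # Already split: route the document to the matching child.
        child = node.children.setdefault(subcategory, Node(subcategory))
        child.docs.append((url, subcategory))
        return
    node.docs.append((url, subcategory))
    if len(node.docs) > MAX_DOCS:
        split(node)


def merge_if_small(parent):
    """Merge leaf children into the parent when they sum to fewer than MAX_DOCS."""
    if not parent.children:
        return False
    if not all(c.is_leaf() for c in parent.children.values()):
        return False
    if parent.count() >= MAX_DOCS:
        return False
    for child in parent.children.values():
        parent.docs.extend(child.docs)
    parent.children = {}
    return True
```

The sketch splits only one level at a time and does not re-split a child that itself exceeds the threshold; a fuller treatment would apply the same rule recursively.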
According to another aspect of the present invention, the bookmark management system associates with each document record one or more user-specific records and one or more owner-specific records. The owner-specific records allow the owner of each bookmark to specify whether or not the bookmark is to be shared, thereby implementing access control. More than one owner-specific or user-specific record can be associated with a single document record. The bookmark management system need store only one bookmark per document. In addition, the bookmark management system can present to a user a customized view of the bookmark.
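By way of illustration only, the relationship among document records, owner-specific records and user-specific records can be sketched as follows. All class, field and method names here (`BookmarkStore`, `OwnerRecord`, `UserRecord`, `shared`, `title`) are hypothetical, and the access rule shown (a non-owner may view a bookmark if at least one owner has marked it shared) is an assumption made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class OwnerRecord:
    owner: str
    shared: bool = False  # the owner's choice whether the bookmark is shared


@dataclass
class UserRecord:
    user: str
    title: str = ""  # per-user customization of how the bookmark is shown


@dataclass
class DocumentRecord:
    url: str
    owners: dict = field(default_factory=dict)  # owner name -> OwnerRecord
    users: dict = field(default_factory=dict)   # user name -> UserRecord


class BookmarkStore:
    def __init__(self):
        self.documents = {}  # url -> DocumentRecord, one record per document

    def add(self, url, owner, shared=False, title=""):
        """Attach owner- and user-specific records to the single document record."""
        doc = self.documents.setdefault(url, DocumentRecord(url))
        doc.owners[owner] = OwnerRecord(owner, shared)
        doc.users[owner] = UserRecord(owner, title)
        return doc

    def can_view(self, url, user):
        doc = self.documents.get(url)
        if doc is None:
            return False
        if user in doc.owners:
            return True
        # Assumed rule: any owner marking the bookmark shared makes it visible.
        return any(o.shared for o in doc.owners.values())

    def view(self, url, user):
        """Return a customized view of the bookmark, or None if access is denied."""
        if not self.can_view(url, user):
            return None
        record = self.documents[url].users.get(user)
        title = record.title if record and record.title else url
        return {"url": url, "title": title}
```

Because every owner and user record hangs off the same `DocumentRecord`, two users bookmarking the same URL add records rather than a second copy of the bookmark.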
In accordance with another aspect of the invention, the bookmark system automatically creates a bookmark for a user or for the system when a document is accessed at a high enough frequency over a period of time. In one embodiment, the "connectedness" of a document (i.e., the number of links into the document and referred by the document) provides a measure to assist in selecting bookmarks to include automatically. The "popularity" of a document (i.e., the percentage of users accessing a document) is also used to assist selection and ranking.
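By way of illustration only, the three measures above can be combined as sketched below. The text does not state how frequency, connectedness and popularity are weighted, so the ordering used here (filter by access frequency, then rank by popularity, then by connectedness) is an assumption, as are the names `AccessLog`, `connectedness`, `auto_bookmarks`, `since` and `min_hits`. Popularity is returned as a fraction between 0 and 1 rather than a percentage.

```python
from collections import defaultdict


class AccessLog:
    """Records which user accessed which document, and when."""

    def __init__(self):
        self.hits = defaultdict(list)  # url -> list of (timestamp, user)

    def record(self, url, user, timestamp):
        self.hits[url].append((timestamp, user))

    def frequency(self, url, since):
        """Number of accesses to url at or after the given timestamp."""
        return sum(1 for ts, _ in self.hits.get(url, []) if ts >= since)

    def popularity(self, url, all_users):
        """Fraction of all users who have accessed url at least once."""
        if not all_users:
            return 0.0
        visitors = {user for _, user in self.hits.get(url, [])}
        return len(visitors & set(all_users)) / len(all_users)


def connectedness(url, links):
    """Links into url plus links out of url; links maps url -> set of target urls."""
    out_links = len(links.get(url, ()))
    in_links = sum(1 for src, targets in links.items() if src != url and url in targets)
    return in_links + out_links


def auto_bookmarks(log, links, all_users, since, min_hits):
    """Select frequently accessed documents and rank them (assumed ordering)."""
    candidates = [u for u in list(log.hits) if log.frequency(u, since) >= min_hits]
    return sorted(
        candidates,
        key=lambda u: (log.popularity(u, all_users), connectedness(u, links)),
        reverse=True,
    )
```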
Alternatively, the bookmark system allows collection of documents by "crawling". In one embodiment, parameters specified for crawling include the number of levels of links followed from a document. The bookmark system can calculate an estimated time based on the number of links. In addition, the bookmark system retrieves and presents to the user sample documents for user consideration prior to completing the crawling request. The bookmark system allows a crawling request to be limited to the number of levels of links to traverse from a seed document. Also, the crawling request can be limited to within a specified domain.
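By way of illustration only, a crawling request limited by number of levels and by domain can be sketched as a breadth-first traversal. The function names `crawl` and `estimate_seconds`, the `fetch_links` callback (which stands in for actual page retrieval and link extraction), the exact-hostname test for "within a specified domain", and the fixed per-page cost are all assumptions made for the example.

```python
from collections import deque
from urllib.parse import urlparse


def crawl(seed, fetch_links, max_levels, domain=None):
    """Breadth-first crawl from seed, following at most max_levels levels of links.

    fetch_links(url) is a caller-supplied function returning the URLs linked
    from a page. If domain is given, only links whose hostname equals it are
    followed (an assumed interpretation of the domain limit).
    """
    seen = {seed}
    queue = deque([(seed, 0)])
    visited = []
    while queue:
        url, level = queue.popleft()
        visited.append(url)
        if level == max_levels:
            continue  # do not follow links beyond the requested depth
        for link in fetch_links(url):
            if link in seen:
                continue
            if domain is not None and urlparse(link).hostname != domain:
                continue
            seen.add(link)
            queue.append((link, level + 1))
    return visited


def estimate_seconds(link_count, seconds_per_page=1.0):
    """Rough time estimate proportional to the number of links (assumed model)."""
    return link_count * seconds_per_page
```

Passing the link-fetching step in as a function keeps the traversal logic separate from network access, so the depth and domain limits can be checked against a small in-memory link graph before any real crawl is issued.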
According to another aspect of the present invention, the bookmark system provides an efficient database management system that includes folders, in addition to document records. In that database system, records are related to each other by pointers, so as to facilitate database operations. The operations of the bookmark management system are achieved by traversal of pointers to document records and folders. For example, when a page has an access pattern satisfying certain predetermined criteria, the bookmark management system can include a bookmark to the page in a special purpose folder by simply associating the folder with a pointer. Such folders can include deletion folders, hot link folders, etc. Subscription folders can also be set up, which periodically or by incremental search provide new or updated information for selected bookmarks. The subscribing users are notified when new or updated information is available.
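By way of illustration only, folders that hold pointers to shared document records, including a hot link folder and a subscription folder, can be sketched as follows. The names `Folder`, `SubscriptionFolder`, `file_if_hot` and `poll`, the access-count threshold, and the `version` counter used to detect updated information are all hypothetical; the text does not specify how updates are detected.

```python
from dataclasses import dataclass, field


@dataclass
class DocumentRecord:
    url: str
    access_count: int = 0
    version: int = 0  # hypothetical change counter, bumped when content changes


@dataclass
class Folder:
    name: str
    entries: list = field(default_factory=list)  # pointers to records, not copies

    def link(self, doc):
        """Associate the folder with a pointer to an existing document record."""
        if not any(d is doc for d in self.entries):
            self.entries.append(doc)


@dataclass
class SubscriptionFolder(Folder):
    subscribers: list = field(default_factory=list)
    seen_versions: dict = field(default_factory=dict)  # url -> last notified version

    def poll(self):
        """Return (user, url) notices for bookmarks that are new or updated."""
        notices = []
        for doc in self.entries:
            if self.seen_versions.get(doc.url) != doc.version:
                self.seen_versions[doc.url] = doc.version
                notices.extend((user, doc.url) for user in self.subscribers)
        return notices


def file_if_hot(doc, hot_folder, threshold=3):
    """Place a pointer in the hot link folder once accesses reach the threshold."""
    if doc.access_count >= threshold:
        hot_folder.link(doc)
        return True
    return False
```

Because a folder stores a reference to the record rather than a copy, filing a page in a hot link folder or a subscription folder leaves a single document record in the database.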