The present invention generally relates to data processing. The invention relates more specifically to managing directories of electronic documents that are used, for example, in a large hypertext document system.
Hypertext systems are widely used. One particular hypertext system, the World Wide Web (“Web”), provides global access over public packet-switched networks to a large number of hypertext documents. The Web has grown to contain a staggering number of documents, and the number of documents continues to increase. The Web has been estimated to contain several hundred million pages and is expected to expand rapidly over the foreseeable future.
The number of documents available through the Web is so large that to use the Web in a practical way almost always requires a search service, search engine, or similar service. The search engines use “spider” programs that “crawl” to Web servers around the world, locate documents, index the documents, and follow hyperlinks in those documents to yet other documents. In one mode of operation, a user provides a search query to the search engine, which locates information in the index about responsive documents and displays a set of search results that satisfy the query.
In another mode of operation, the index may be organized in the form of a hierarchical directory that is structured using a taxonomy of categories. The search engine displays a top-level set of categories from the taxonomy. Each category is hyperlinked to one or more subordinate categories that are associated with that category. Each category may also be associated with one or more documents that fall within that category. In that case, the search engine also displays a list of the documents. A user may browse the taxonomy and the underlying directory by selecting successive categories until a category of interest is reached, or may select a document associated with a particular category. This mode of operation is available using the search engine “Disney's Internet Guide” (DIG), which is accessible online at www.dig.com.
In this field, separate enterprises or companies may own and operate the search engine and the technology that is used to create and manage the underlying directory (the “master directory”). Different search engine operators may wish to provide a taxonomy to their end users that is different from the standard taxonomy that is reflected by the master directory. Also, various search engine operators may wish to classify a particular document in a category that is different from that in which it is classified in the master directory. Thus, there is a need to enable different search engine operators to establish different, customized underlying directories.
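By way of illustration only, the hierarchical directory described above may be sketched as a tree of categories, each holding subordinate categories and associated documents. The names `Category`, `add_subcategory`, and `browse` are hypothetical and are not taken from any particular search engine; the sketch assumes that documents are identified by simple strings.

```python
from dataclasses import dataclass, field


@dataclass
class Category:
    """One node of the taxonomy (hypothetical structure)."""
    name: str
    subcategories: dict = field(default_factory=dict)  # name -> Category
    documents: list = field(default_factory=list)      # document identifiers

    def add_subcategory(self, name):
        # Create the child category if absent, then return it.
        return self.subcategories.setdefault(name, Category(name))


def browse(root, path):
    """Follow successive category selections from the top level.

    Returns the subordinate category names and the documents that
    would be displayed for the category that is reached.
    """
    node = root
    for name in path:
        node = node.subcategories[name]
    return sorted(node.subcategories), list(node.documents)
```

Calling `browse(root, [])` corresponds to displaying the top-level set of categories, and each additional element of `path` corresponds to one category selection by the user.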
A search engine operator with a customized directory, however, normally will still want to receive updates to the master directory issued by the owner of the directory technology. Thus, there is a need to integrate the updates into the customized directory in a way that does not override or disrupt the customizations that are reflected in the customized directory.
There is also a need for a system or mechanism that provides a convenient way to create and store a custom directory based on a master directory, in which the custom directory has a custom taxonomy that is different from the taxonomy of the master directory.
There is also a need for such a system or mechanism in which the custom taxonomy classifies documents in different categories or according to custom judgements that differ from or conflict with corresponding judgements in the master directory.
There is also a need to integrate the updates into the customized directory in a way that does not override or disrupt the custom judgements that are reflected in the customized directory.
The foregoing needs and objects, and other needs and objects that will become apparent from the following description, are achieved by the present invention, which comprises, in one aspect, a method of managing changes to a directory of electronic documents, comprising the steps of creating and storing a first directory of the electronic documents having a hierarchy of one or more categories into which one or more of the electronic documents are classified; creating and storing a second directory that is based on the first directory, one or more customizations that represent differences between hierarchies of the first directory and the second directory, and one or more judgements that represent whether one or more of the electronic documents are properly classified in the categories; modifying the first directory by changes to one or more of the categories; and automatically propagating the changes to the second directory, without modifying the customizations or the judgements, to thereby create and store a modified second directory.
One feature of the invention involves creating and storing the first directory of the electronic documents that is defined by a hierarchy of one or more categories into which one or more of the electronic documents are classified and by one or more master judgements that represent whether one or more of the electronic documents are properly classified in the categories; modifying the first directory by changes to one or more of the master judgements; and automatically propagating the changes to the second directory, without modifying the customizations or the judgements, to thereby create and store a modified second directory.
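One way to picture the propagation described above is to store the customizations and custom judgements separately from the first directory and to derive the second directory from them on demand. The following is a minimal sketch under that assumption; the function `build_custom`, the dictionary-based representation, and the boolean form of the judgements are illustrative choices and are not prescribed by this description.

```python
def build_custom(master, category_map, custom_judgements):
    """Derive the second directory from the first (illustrative only).

    master:            dict of master category -> set of documents
    category_map:      dict of master category -> custom category
                       (the stored customizations; unmapped categories
                       keep their master name)
    custom_judgements: dict of (document, custom category) -> bool,
                       True meaning "in" and False meaning "not in"
    """
    custom = {}
    # Carry every master classification into its custom category.
    for category, documents in master.items():
        target = category_map.get(category, category)
        custom.setdefault(target, set()).update(documents)
    # Apply the custom judgements last so they are never overridden.
    for (document, category), is_in in custom_judgements.items():
        if is_in:
            custom.setdefault(category, set()).add(document)
        else:
            custom.get(category, set()).discard(document)
    return custom
```

Because the function only reads `category_map` and `custom_judgements`, a change to `master` can be propagated by calling it again, which yields a modified second directory while leaving the stored customizations and judgements untouched.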
Another feature relates to creating and storing one or more judgements of the second directory by the steps of displaying a taxonomy of categories of the second directory; receiving a selection of one of the categories; displaying one or more judgements associated with one of the electronic documents in the selected category; and receiving and storing a quality value that defines how closely the one of the electronic documents matches the selected category. In another feature, creating and storing the second directory includes creating and storing the second directory based on the first directory and one or more customizations that reflect a merge of a plurality of the categories of the first directory to one of the categories of the second directory.
According to another feature, creating and storing the second directory includes the step of creating and storing the second directory based on the first directory and one or more customizations that reflect a split of one of the categories of the first directory into a plurality of the categories of the second directory. A related feature involves creating and storing a new judgement in the second directory, wherein the new judgement indicates that a particular electronic document is in one of the categories of the second directory. Another feature is that the new judgement indicates that a particular electronic document is not in one of the categories of the second directory. Still another feature is that the new judgement indicates that a particular electronic document is locked out of all categories of the second directory.
In another feature, the method comprises marking the new judgement as un-reviewed; and receiving an acceptance signal indicating that the new judgement is accepted, and in response thereto, persistently storing the new judgement in the second directory. A related feature involves receiving a rejection signal indicating that the new judgement is rejected, and in response thereto, modifying the new judgement to indicate that the electronic document is not in the category.
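The review of a new judgement may be sketched as follows. The `Judgement` record, the `accept` and `reject` functions, and the use of a list as a stand-in for persistent storage are assumptions made for this example only.

```python
from dataclasses import dataclass


@dataclass
class Judgement:
    """A judgement about one document and one category (hypothetical)."""
    document: str
    category: str
    in_category: bool = True
    reviewed: bool = False  # new judgements start as un-reviewed


def accept(judgement, store):
    """Handle an acceptance signal: keep the judgement and store it."""
    judgement.reviewed = True
    store.append(judgement)  # stand-in for persistent storage


def reject(judgement):
    """Handle a rejection signal: the document is not in the category."""
    judgement.in_category = False
```

In this sketch a rejected judgement is retained as a "not in category" judgement rather than discarded, which follows the rejection behavior described above.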
According to still another feature, the method includes integrating the customizations and judgements into the second directory by identifying in the customizations to the first directory, each mapping of a source category of the first directory to a destination category of the second directory; copying each judgement that is in the source category to the destination category; and marking as un-reviewed, each judgement that is copied to the destination category and that originates from a split mapping of the source category.
In another feature, the method marks as un-reviewed, each judgement that is copied to the destination category and that originates from an un-reviewed judgement in the source category.
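The integration steps above may be illustrated by the following sketch. It assumes that the customizations are available as a list of (source category, destination category) pairs and that a split is recognized when one source category maps to more than one destination category; both are assumptions of the example, as are the names `Judgement` and `integrate`.

```python
from collections import Counter
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Judgement:
    document: str
    category: str
    reviewed: bool = True


def integrate(judgements, mappings):
    """Copy judgements from source to destination categories.

    mappings is a list of (source_category, destination_category)
    pairs. A copy is marked un-reviewed if it originates from a split
    mapping or from a judgement that was already un-reviewed.
    """
    # Count destinations per source to detect split mappings.
    fan_out = Counter(source for source, _ in mappings)
    result = []
    for source, destination in mappings:
        is_split = fan_out[source] > 1
        for judgement in judgements:
            if judgement.category != source:
                continue
            reviewed = judgement.reviewed and not is_split
            result.append(
                replace(judgement, category=destination, reviewed=reviewed)
            )
    return result
```

Marking split copies as un-reviewed reflects that, after a split, it is not known automatically which of the destination categories a given document belongs in.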
According to another feature, the method further comprises the step of creating and storing one or more custom judgements that are associated with the second directory, wherein the custom judgements have a judgement type value selected from among “in category,” “not in category,” “exclude from all categories”, and “undo exclude from all categories”; and integrating each of the custom judgements into the second directory by overriding any conflicting judgement originating from the first directory. A related feature is that the integrating is carried out such that: judgements of lower priority cannot affect earlier judgements of higher priority; “in” judgements override previous “in” and “not in” judgements; “not in” judgements override previous “in” and “not in” judgements; “exclude from all categories” judgements override previous “in” judgements; and “undo exclude from all categories” judgements override previous “exclude from all categories” judgements.
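The override rules above may be sketched as an ordered pass over the judgements. The sketch assumes that each judgement carries an integer priority, with a larger number meaning higher priority (for example, 0 for judgements originating from the first directory and 1 for custom judgements), and that an exclusion hides only memberships of equal or lower priority. These representations, and the names `Judgement` and `integrate`, are assumptions of the example rather than requirements of the method.

```python
from dataclasses import dataclass
from typing import Optional

IN = "in category"
NOT_IN = "not in category"
EXCLUDE_ALL = "exclude from all categories"
UNDO_EXCLUDE_ALL = "undo exclude from all categories"


@dataclass
class Judgement:
    kind: str
    document: str
    category: Optional[str] = None  # unused for the two exclude kinds
    priority: int = 0               # assumed: larger means higher priority


def integrate(judgements):
    """Apply judgements in order; return the visible (document, category) pairs."""
    membership = {}  # (document, category) -> (is_in, priority)
    excluded = {}    # document -> (is_excluded, priority)
    for j in judgements:
        if j.kind in (IN, NOT_IN):
            key = (j.document, j.category)
            _, earlier = membership.get(key, (False, j.priority))
            # A lower-priority judgement cannot affect an earlier,
            # higher-priority one.
            if j.priority >= earlier:
                membership[key] = (j.kind == IN, j.priority)
        else:
            _, earlier = excluded.get(j.document, (False, j.priority))
            if j.priority >= earlier:
                excluded[j.document] = (j.kind == EXCLUDE_ALL, j.priority)
    visible = set()
    for (document, category), (is_in, priority) in membership.items():
        is_excluded, exclude_priority = excluded.get(document, (False, 0))
        if is_in and not (is_excluded and exclude_priority >= priority):
            visible.add((document, category))
    return visible
```

Keeping the exclusion state separate from the per-category state lets an “undo exclude from all categories” judgement restore the earlier “in” judgements without having to re-create them.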
The invention also encompasses an apparatus and a computer-readable medium that may be configured to implement the foregoing.