1. Field of the Invention
The invention relates generally to the management and use of documents, and in particular, to improved management and use of documents which may act as agents, generating requests for information, then seeking, retrieving and packaging responses to enrich the documents while facilitating reading comprehension, understanding relationships with other documents, and content creation. More specifically, this invention relates to a meta-document server with auto-completion and auto-correction.
2. Description of Related Art
Knowledge management through document management forms an important part of the knowledge creation and sharing lifecycle. A typical model of knowledge creation and sharing is cyclical, consisting of three main steps: synthesizing (search, gather, acquire and assimilate), sharing (present, publish/distribute), and servicing (facilitate document use for decision making, innovative creativity).
Most systems consider documents as static objects that only acquire new content when acted upon by an authorized user. A user's decision to read and modify a document, or to run a program on it which may change its contents (for example, by adding hyperlinks), is needed for the document to acquire new information. This view of the document as a passive repository leads to the current situation in which documents remain static unless a user is in front of the screen piloting the system. OpenCola Folders™ offers one solution to the view of the document as a passive repository by creating folders on a user's computer that look for a limited set of document types, according to criteria set by the user (i.e., a single-purpose information retrieval system).
Both agent-based systems and content-based retrieval systems provide some management of information without user intervention. An agent is a software program that performs a service, such as alerting the user of something that needs to be done on a particular day, or monitoring incoming data and giving an alert when a message has arrived, or searching for information on electronic networks. An intelligent agent is enabled to make decisions about information it finds. Both such systems, however, consider documents to be fixed and static entities.
Many products provide various solutions for individual aspects of the overall problem of knowledge management: anticipatory services, unstructured information management, and visualization of information and knowledge. Watson, for example, from the InfoLab at Northwestern University, is a program which operates while a user is creating a document. Watson retrieves information as the user works, and the user can select items from that information for further investigation. Information retrieved by Watson comes from a service provider, and Watson stores the retrieved information in memory associated with Watson.
Also, Autonomy.com's ActiveKnowledge™ analyzes documents that are being prepared on the user's computer desktop and provides links to relevant information. In addition, online services such as Alexa.com, Zapper.com, and Flyswat.com suggest links that are relevant to the content currently viewed or highlighted in a browser window. The suggested links appear in an additional window inside or separate from the current browser window. These services treat documents as static objects. Specifically, using Zapper.com's engine, when a user right-clicks on selected text, words surrounding the selected text are analyzed to understand the context of the search request, and to reject pages that use those words in a different context.
Various products, such as commercial information retrieval systems, manage unstructured information, such as web pages, documents, emails, etc. (the content of which may consist of text, graphics, video, or audio). Typical management services for unstructured information include: search and retrieval; navigation and browsing; content extraction, topic identification, categorization, summarization, and indexing; organizing information by automatic hyperlinking and creation of taxonomies; user profiling by tracking what a user reads, accesses, or creates in order to create communities; etc. For example, Inxight's parabolic tree is a system that organizes unstructured information and presents it in an intuitive tree-like format.
Furthermore, it is known how to embed executable code in documents to perform certain functions at specified times. For example, European Patent Applications EP 0986010 A2 and EP 1087306 A2 set forth different techniques for defining active documents (i.e., documents with embedded executable code). More specifically, these publications set forth that executable code within a document can be used to control, supplement, or manipulate the content of that document. Such active documents are said to have active properties.
Notwithstanding these existing methods for statically and actively enriching document content, there continues to exist a need to provide an improved document enrichment architecture that allows ubiquitous use of document enrichment services. Such an improved document enrichment architecture would advantageously provide methods for facilitating the use of such services by automatically attaching, monitoring, and suggesting such services for users.
In accordance with one aspect of the invention, there is provided a method, and an apparatus therefor, for auto-completing document content. The method includes: receiving a signal specifying an auto-completion request that identifies an entity fragment of a target document; analyzing content surrounding the entity fragment in the target document to provide context information for identifying a first document attribute; defining a query using the entity fragment and the first document attribute; accessing a database of entities using the query to identify a set of entities that satisfy the auto-completion request, where the database of entities includes entities and entity context information that identifies a second document attribute; wherein the act of accessing compares the first document attribute and the second document attribute to determine a degree of match between the entity fragment and the entities in the database of entities.
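By way of illustration only, the first aspect may be sketched as follows. The sketch assumes that the document attribute is a subject category inferred from keyword overlap, and that the degree of match is simply whether the two attributes agree. The names Entity, CATEGORY_KEYWORDS, infer_attribute and auto_complete, and the keyword lists, are hypothetical and are not taken from the specification.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Entity:
    text: str
    attribute: str  # the "second document attribute" stored with the entity


# Hypothetical keyword lists used to classify the surrounding content.
CATEGORY_KEYWORDS = {
    "biology": {"cell", "protein", "gene", "species"},
    "finance": {"bank", "stock", "market", "loan"},
}


def infer_attribute(surrounding_text: str) -> Optional[str]:
    """Derive the "first document attribute" from content around the fragment."""
    words = set(surrounding_text.lower().split())
    scores = {cat: len(words & kws) for cat, kws in CATEGORY_KEYWORDS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


def auto_complete(fragment: str, surrounding_text: str,
                  database: List[Entity]) -> List[str]:
    """Return completions, ranking entities whose attribute matches first."""
    first_attribute = infer_attribute(surrounding_text)
    prefix = fragment.lower()
    candidates = [e for e in database if e.text.lower().startswith(prefix)]
    # Assumed degree of match: attribute agreement first, then alphabetical.
    ranked = sorted(candidates,
                    key=lambda e: (e.attribute != first_attribute, e.text))
    return [e.text for e in ranked]
```

In this sketch, the same fragment yields a different ordering of completions depending on the content that surrounds it, which is the effect the comparison of the first and second document attributes is intended to achieve.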
In accordance with another aspect of the invention, there is provided a method, and an apparatus therefor, for auto-completing document content. The method includes: defining an information space for target document content; adding entities to a database of entities using the target document content and the information space for the target document content; receiving an auto-completion request that includes an entity fragment of the target document; analyzing content surrounding the entity fragment in the target document to provide associated context information; formulating a query using both the entity fragment of the target document and its associated context information; and using the query to identify a set of entities in the database of entities that satisfy the auto-completion request.
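A minimal sketch of this second aspect is given below, under stated assumptions: the information space is modeled as a list of related texts, an entity is any word of four or more letters, and the context information stored with each entity is the set of words found within a fixed window around it. The names tokenize, add_entities, build_database and auto_complete, and the window size, are hypothetical choices made for illustration.

```python
import re
from collections import defaultdict
from typing import Dict, List, Set


def tokenize(text: str) -> List[str]:
    return re.findall(r"[a-z]+", text.lower())


def add_entities(database: Dict[str, Set[str]], text: str,
                 window: int = 5) -> None:
    """Add each word of 4+ letters as an entity, with nearby words as context."""
    tokens = tokenize(text)
    for i, tok in enumerate(tokens):
        if len(tok) < 4:
            continue
        neighbours = tokens[max(0, i - window):i] + tokens[i + 1:i + 1 + window]
        database[tok].update(neighbours)


def build_database(target_text: str,
                   information_space: List[str]) -> Dict[str, Set[str]]:
    """Populate the entity database from the target document and its space."""
    database: Dict[str, Set[str]] = defaultdict(set)
    for text in [target_text, *information_space]:
        add_entities(database, text)
    return database


def auto_complete(fragment: str, surrounding_text: str,
                  database: Dict[str, Set[str]]) -> List[str]:
    """Query with both the fragment (prefix) and its context (word overlap)."""
    context = set(tokenize(surrounding_text))
    prefix = fragment.lower()
    matches = [(len(database[e] & context), e)
               for e in database if e.startswith(prefix)]
    matches.sort(key=lambda m: (-m[0], m[1]))
    return [e for _, e in matches]
```

Because the query combines the fragment with its associated context, an entity whose stored context overlaps the surrounding content is ranked ahead of an entity that merely shares the prefix.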
In accordance with yet another aspect of the invention, there is provided a method, and apparatus therefor, for auto-correcting document content. The method includes: defining an information space using the document content; adding entities to a database of entities using the document content and its information space; identifying errors in the document content; formulating a query using the identified errors; identifying a set of entities in the database of entities that satisfy the query; correcting the document content using the identified set of entities; and updating the information space with the corrected document content.
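The auto-correction steps may be sketched as follows, again as an illustration rather than the claimed implementation. The sketch assumes that the information space is a list of related texts, that a word occurring only once across the document and its information space is a possible error, and that candidate corrections are found by approximate string matching using the Python standard library function difflib.get_close_matches. The names add_entities and auto_correct and the similarity cutoff of 0.8 are hypothetical.

```python
import difflib
import re
from collections import Counter
from typing import List, Tuple

WORD = re.compile(r"[A-Za-z]+")


def add_entities(database: Counter, text: str) -> None:
    """Count every word of the text as an entity in the database."""
    database.update(w.lower() for w in WORD.findall(text))


def auto_correct(document: str,
                 information_space: List[str]) -> Tuple[str, List[str]]:
    # Add entities using the document content and its information space.
    database: Counter = Counter()
    for text in [document, *information_space]:
        add_entities(database, text)

    # Assumption: entities seen more than once are treated as correct.
    trusted = [entity for entity, count in database.items() if count > 1]

    def fix(match: "re.Match[str]") -> str:
        word = match.group(0)
        key = word.lower()
        if database[key] > 1:
            return word  # not identified as an error
        # Query the trusted entities with the suspected error.
        candidates = difflib.get_close_matches(key, trusted, n=1, cutoff=0.8)
        # Note: the replacement is lowercase; original case is not preserved.
        return candidates[0] if candidates else word

    corrected = WORD.sub(fix, document)
    # Update the information space with the corrected document content.
    return corrected, [*information_space, corrected]
```

A frequency threshold is only one possible way of identifying errors; a dictionary lookup or a language model could be substituted without changing the overall sequence of steps.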