1. Field of the Invention
The present invention relates to an SGML (Standard Generalized Markup Language) document managing apparatus for allowing users to collaboratively create, edit, and revise a large SGML document sequence, such as a manual.
2. Description of the Related Art
In recent years, many machine-readable documents have become available, and a new document can easily be created by recomposing existing ones. Reusing components of existing electronic documents remarkably reduces the cost of creating a document. In addition, as the volume and variety of manuals grow with the advancement of technology and the advent of new document circulating media such as the Internet, the number of documents to be managed has increased explosively. Inevitably, the need to re-use electronic documents is growing. SGML was proposed to satisfy this need and is used worldwide.
The SGML format is designed to provide a machine-independent way of document encoding, and to enable document interchange from one hardware and software environment to another without loss of information. In addition, the format of an SGML document can be easily recognized by computer. A part of the contents of one SGML document can be easily re-used as a part of another. Moreover, the same SGML document can be re-used in various forms.
Thus, SGML documents have mainly been used for creating and managing technical documents such as manuals, and for computer aided printing and electronic publishing. In addition, SGML documents have been widely used in fields that handle large, long-lived documents, such as network communications, electronic trading, and databases such as electronic libraries. For example, the books and catalogs of electronic libraries have been stored and managed in the SGML format. Moreover, the HTML format used for text interchange on the Internet, a communications system that is growing explosively day by day, is an application of SGML.
Since documents in the SGML format are used widely and in large numbers, the need for an apparatus that manages SGML documents is growing. The main features of SGML documents are that they are large in scale and long-lived. Thus, the SGML document managing apparatus should support the life-cycle of SGML documents, namely their creation, storage, and re-use. The structures of SGML documents should conform to document type definitions (DTD). Thus, the apparatus also requires a document structure consistency managing system.
The present invention relates to such an SGML document managing apparatus that has a document structure consistency managing system and a document structure revision history managing system.
The following prior art technologies have been used.
(1) Technology for maintaining the consistency of an entire document that is collaboratively edited
To create a document of a large volume, an environment that enables collaborative work and the re-use of existing documents for creating and editing a document is required. The SGML document managing apparatus should have a system that allows parts of a document to be independently edited while maintaining the consistency of the entire document.
Even if individual works are properly performed, they may conflict with each other and break the entire document consistency. If one work conflicts with another work that is performed along with it, its result should be saved to the original document after the conflicted parts are corrected or are discarded. (If inconsistent work results are saved, the resultant SGML document cannot be used for other SGML applications.) Thus, the work efficiency deteriorates.
To solve this problem, the conventional structured document managing apparatus uses a check-out/check-in system. In other words, every document portion to be edited is checked out so that the portion of the document is assigned a read-only state whereby the other workers cannot update this portion. After the editing work is completed, the document is checked in so that the read-only state is canceled. In the process of editing, the particular worker can update only portions that have been checked out.
The fundamental structure of a conventional SGML document managing apparatus is the same as that of the above-described structured document managing apparatus. The conventional SGML document managing apparatus uses the check-out/check-in system of document elements. It checks out a document element according to the request for editing from a worker, and allows the worker to update only the portion in lower hierarchical levels than the checked-out element. When the portion is checked in, it examines the conformance to the DTD of the entire document to maintain the consistency of the document (this operation is referred to as SGML parsing).
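The check-out/check-in behavior described above can be illustrated with a minimal sketch. It is not the implementation of any particular apparatus: the class name CheckoutRegistry, its methods, and the parent_of mapping are hypothetical names introduced only for this illustration, and the DTD conformance check performed at check-in is omitted.

```python
class CheckoutRegistry:
    """Hypothetical registry of checked-out document elements."""

    def __init__(self, parent_of):
        # parent_of maps an element id to its parent id (None for the root).
        self.parent_of = parent_of
        self.owner = {}  # checked-out element id -> worker name

    def _ancestors_and_self(self, elem):
        # Walk from elem up to the root, yielding elem first.
        while elem is not None:
            yield elem
            elem = self.parent_of[elem]

    def check_out(self, elem, worker):
        # Refuse if elem or one of its ancestors is already checked out.
        if any(e in self.owner for e in self._ancestors_and_self(elem)):
            return False
        # Refuse if a descendant of elem is already checked out.
        if any(elem in self._ancestors_and_self(held) for held in self.owner):
            return False
        self.owner[elem] = worker
        return True

    def can_update(self, elem, worker):
        # Only elements strictly below an element that this worker
        # has checked out may be updated.
        parent = self.parent_of[elem]
        return any(self.owner.get(e) == worker
                   for e in self._ancestors_and_self(parent))

    def check_in(self, elem, worker):
        # A real apparatus would parse the document against the DTD here.
        if self.owner.get(elem) != worker:
            raise ValueError("element is not checked out by this worker")
        del self.owner[elem]


parent_of = {"doc": None, "p1": "doc", "s1": "p1", "s2": "p1", "t1": "s1"}
registry = CheckoutRegistry(parent_of)
registry.check_out("s1", "alice")
```

In this sketch the worker who checked out s1 may update t1 but not s1 itself, which mirrors the restriction that only the portion in lower hierarchical levels than the checked-out element can be updated.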
(2) Technology for effectively parsing an SGML document
To examine whether or not an SGML document that has been partially updated conforms to the DTD (namely, to parse the SGML document), all elements of the document, including elements that are not edited, should be examined. In other words, each document element should be correlated with the DTD starting with the root document element. Thus, to examine the result of partial editing, the tag information (generic identifiers) of the unedited parts is necessary in addition to the result of editing itself. (The generic identifier (GI) is a name that represents the type of a document element. The DTD defines the arrangement of document elements that can occur in the immediately lower hierarchical level of a particular type of document element in terms of a sequence of generic identifiers.)
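The idea that a DTD constrains the sequence of generic identifiers directly below an element can be sketched as follows. This is a simplified illustration, not an SGML parser: it assumes a content model that uses only element names, parentheses, the "," and "|" connectors, and the "?", "*", and "+" occurrence indicators (no "&" connector, no #PCDATA, no inclusions or exclusions). The function names are hypothetical.

```python
import re


def content_model_to_regex(model):
    """Translate a simplified content model into a regular expression.

    Each element name becomes a group matching that name followed by one
    space, so a sequence of child GIs can be matched as a single string.
    """
    parts = []
    for token in re.findall(r"[A-Za-z][A-Za-z0-9.-]*|[(),|?*+]", model):
        if token == ",":
            continue  # a sequence is plain concatenation in a regex
        elif token == "(":
            parts.append("(?:")
        elif token in ")|?*+":
            parts.append(token)
        else:
            parts.append("(?:" + re.escape(token) + " )")
    return re.compile("".join(parts))


def children_conform(model, child_gis):
    """Return True if the list of child GIs satisfies the content model."""
    text = "".join(gi + " " for gi in child_gis)
    return content_model_to_regex(model).fullmatch(text) is not None


print(children_conform("(TITLE, SECT+)", ["TITLE", "SECT", "SECT"]))
print(children_conform("(TITLE, SECT+)", ["SECT"]))
```

The first call reports a conforming sequence and the second a non-conforming one. Note that the check needs the GIs of all children of the parent, edited or not, which is why tag information of unedited parts must be available.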
To simply perform this operation, the document managing apparatus that stores an SGML document in units of document elements extracts tags of document elements, at least all the tags in the document part preceding the edited portion, from a database that stores the document. However, the cost to access the database that stores the document is high. To solve this problem, in the conventional technologies, a document structure and document contents are separately stored, or every document element is stored with a tag sequence of its preceding part so that necessary tag information to parse can be quickly extracted. Thus, the cost for parsing an SGML document is reduced.
(3) Technologies for managing the revision history of an SGML document
With respect to management of a revision history of an SGML document, technologies for managing structured documents such as SGML documents have not matured. Technologies for holding changes of a structure as a revision history have not been established. Generally, in a method for managing a revision history for each document element, only the original document element of the revised document or the difference due to the revision is stored. Thus, a history of structural changes, such as changes of positions of document elements, is not held. As a related art reference for managing a history of structural changes, a "structural database system" has been disclosed as Japanese Patent Laid-Open Publication No. 6-250895. However, in this system, documents are represented in a machine-dependent format. Thus, SGML documents with history information cannot be freely interchanged.
There are two problems to be solved by the present invention.
1) Restrictions on updating a document are too strong.
When a document is collaboratively edited, while keeping individual editing works correct, the consistency of the entire document should be also maintained. However, in the conventional technologies, the consistency of the entire document takes too much precedence over the independence of the individual editing works.
In the conventional technologies, only the checked-out portion of a document (namely, the contents and attributes of document elements in lower hierarchical levels than a particular document element (sequence) to be checked out) can be edited. Thus, the tag(s) of the checked-out element(s), which represents a type of document element, is prohibited from being changed. In addition, the document element(s) that has been checked out is prohibited from being divided or deleted. In substance, however, only portions that depend on each other in the DTD cannot be edited at the same time. Depending on the DTD, a tag may be changed independently of another work, or an element may be divided or deleted independently of another work. Without analysis of document element dependency in the DTD, these restrictions become unnecessarily strong.
For example, when a document is collaboratively created section by section, a writer may want to divide his/her checked-out section. Without analysis of the DTD, the restriction prohibits this operation. The writer must stop editing, check out the document element in the one-level higher hierarchical level, and re-edit. When the editing work is suspended, the creative thoughts of the writer may be adversely affected. In addition, the document element in the higher hierarchical level may not be available for editing. When editing works are performed in a distributed environment, the document main body may be physically unavailable (for example, because of a network disconnection). Thus, the collaborative editing works are disturbed.
To allow an editing work for dividing a section to be performed, the dependent relation of the document structure of an element to be checked out should be obtained before the element is edited. In addition, the editing work should be restricted corresponding to the dependent relation.
An SGML document is formed in a tree structure. Sister document elements (instances), which have the same parent document element (instance), may depend on each other corresponding to a related element declaration in a document type definition. Thus, to ensure the consistency of collaborative editing works, the dependency among the sister document elements should be obtained in advance. For example, when editing an instance of a type of document element that the DTD defines as being able to occur any number of times (including zero times), the instance can be divided into elements of the same type or be deleted. However, when the DTD defines a type of document element as occurring once, the instance of the type cannot be edited in such a manner.
To determine whether instances of a type of document element defined as occurring a plurality of times (at least once) can be deleted, it is necessary to check the number of instances of the type of document element, namely instances that have the same tag, and how many of them may be deleted by another editing work. For example,
______________________________________
<!ELEMENT PART (SECT)+>
<!--PART is composed of one or more SECTs-->
<PART id = p1>
<SECT id = s1>First section </SECT>
<SECT id = s2>Second section </SECT>
</PART>
______________________________________
In this document structure, either one of the document elements id=s1 and id=s2 can be deleted, but not both, because PART must contain at least one SECT.
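The counting involved in the PART example can be written out as a small sketch. It assumes the simplest case, in which the parent's content model is a single element type with one occurrence indicator; the function name deletable_count and its parameters are hypothetical.

```python
def deletable_count(occurrence, existing, pending_deletes):
    """Return how many more instances of an element type may be deleted.

    occurrence      -- occurrence indicator from the DTD:
                       "" (exactly once), "?", "*", or "+"
    existing        -- number of instances currently in the document
    pending_deletes -- instances that parallel editing works may delete
    """
    # Minimum number of instances the DTD requires to remain.
    min_required = {"": 1, "?": 0, "*": 0, "+": 1}[occurrence]
    remaining = existing - pending_deletes
    return max(0, remaining - min_required)


# <!ELEMENT PART (SECT)+> with two SECT instances, no parallel deletion:
print(deletable_count("+", 2, 0))
# Same structure, but another work may already delete one SECT:
print(deletable_count("+", 2, 1))
```

With no parallel deletion one SECT may be deleted; once another editing work may delete a SECT, none may. The answer therefore depends on the current document structure and on the other works in progress, not on the DTD alone.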
Such a restriction depends on not only the DTD, but also the results of other editing works that are being executed in parallel. The conventional SGML document managing apparatus, which examines the result of partial editing using the DTD for the entire SGML document (namely, it parses the result with the original DTD), cannot handle the restrictions of this kind. In other words, to enable local parsing of a result of partial editing, it is necessary to provide another editing restricting system or another partial editing DTD.
With respect to this point, the present invention automatically creates a partial editing DTD, which represents the restrictions on partial editing, to enable local parsing of the result of partial editing. When a portion to be edited is checked out, the partial editing DTD is created by modifying the original DTD according to the restrictions calculated from the existing document structure and the other editing works executing in parallel at that time.
2) Insufficient revision history control of structural changes of a document
As another problem of the conventional technologies, a sufficient history of structural changes of an SGML document cannot be held. Conventionally, a history is held as transitions of individual document elements, and a history of structural changes such as a move operation and an exchange operation of document elements cannot be accurately held.
For example, the conventional revision history controller recognizes a divide operation of a document element as a partial delete operation of a document element and an append operation of a new document element. In addition, it recognizes an exchange operation of document elements as two sets of a delete operation and an append operation. Alternatively, it may record the histories of divide operations, exchange operations, or the like simply as the difference between the original document and the resultant document after these update operations. Although these update histories can recompose the resultant document from the original document, they cannot represent exactly what the writer(s) intended to do. Thus, the reliability of the update history of the divided document elements and exchanged document elements decreases.
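The loss of intent can be illustrated with a short sketch that replays two histories over the same list of child elements. The operation names, the tuple format, and the function apply_ops are hypothetical and are not taken from any existing revision history controller.

```python
def apply_ops(children, ops):
    """Replay a list of history operations over a list of child elements."""
    result = list(children)
    for op in ops:
        kind = op[0]
        if kind == "delete":
            _, index = op
            del result[index]
        elif kind == "append":
            # Add an element at the given position.
            _, index, elem = op
            result.insert(index, elem)
        elif kind == "exchange":
            # A single structural operation that swaps two elements.
            _, i, j = op
            result[i], result[j] = result[j], result[i]
        else:
            raise ValueError("unknown operation: " + kind)
    return result


original = ["s1", "s2", "s3"]

# Conventional history: two sets of a delete and an append.
conventional = [("delete", 0), ("append", 2, "s1"),
                ("delete", 1), ("append", 0, "s3")]

# Structural history: the exchange is recorded as what it was.
structural = [("exchange", 0, 2)]

print(apply_ops(original, conventional) == apply_ops(original, structural))
```

Both histories recompose the same resultant document, but only the structural history states that s1 and s3 were exchanged; the conventional one is indistinguishable from unrelated deletions and additions.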
An apparatus that manages a history of structural changes has been proposed as Japanese Patent Laid-Open Publication No. 6-250895. However, in the apparatus, since information being managed is machine dependent, it cannot freely exchange history information of an SGML document with other applications.