Documenting software products and other technical subjects has always been difficult. The technical writer must learn the subject and write a document that helps the reader understand the subject, find information quickly, and troubleshoot problems. The more complex the subject, the more difficult the writer's task. Each significant advance in modern technology brings a new level of complexity and challenge for the writer of technical documentation.
A Common Problem: Documenting Multiple Variations of a Subject
Existing Document Generation Systems and Their Limitations
SGML: A Standard That Promotes Reuse
SGML and Object-Oriented Information Management
For example, consider the case of a set of user manuals for a software product. When a new product is first released, it typically operates on only one hardware platform, such as a personal computer (PC), a workstation or a specific mainframe system. One set of user manuals is created for the product. During the software product life cycle, however, the product will generally become increasingly sophisticated and will be upgraded to run on additional hardware platforms. Now the documentation must not only track the increased sophistication but also adapt to variations in operation introduced by the new hardware platforms.
In the past, technical writers have used one of two approaches to handle variations (such as multiple hardware platforms) in a product offering. In the first approach, each variation is documented separately. That is, the writers will write documentation for the initial product and then modify or rewrite that document for each product variation. The result is a sequence of technical documents having a common ancestor, but no more. Changes to one document are not necessarily reflected in another document. The separate variations of the documents quickly become inconsistent and hard to manage. The redundant documents are costly for the companies to produce and maintain and are less usable for customers who want to compare related products.
In the second approach, all variations are documented together in a single document, with variation-specific information intermixed with common information. For customers who want information about only one variation, such an approach can be confusing. They may, for instance, find it difficult to separate the information relating to their specific variation from information relating to the other variations.
This problem of documenting multiple variations of information is a common one. For software documentation, other examples include products that implement multiple variations of a standard or run on multiple graphical user interfaces. And the problem is not limited to software documentation. For example, a piece of equipment, such as a television, lawn mower, or kitchen appliance, often comes in several models, each with its own variation of instructions. A course workbook might be developed for a subject that will be taught to novice, intermediate, or advanced students, each with variations of similar information. Even a cookbook might include recipes with variations for low-cholesterol, low-sodium, or diabetic diets.
To produce documentation, in both printed and electronic forms, technical writers use document generation systems. Most document generation systems today use a form of procedural (or physical) markup within the source data files. Procedural markup specifies the presentation of a particular region of text in an output document, such as the use of bold or italics for emphasis. This type of markup is specific to one particular output format.
Procedural markup has an inherent limitation: It inhibits the reuse of text. Because it specifies formatting characteristics, such as bold or shading, procedural markup does not facilitate reuse of text for different output media. For example, the formatting command that specifies a shaded region of text for printed output has no corresponding presentation technique for on-screen presentation. Or a table structure with horizontal and vertical boundaries may need to be displayed differently on the screen than it is on paper. And font styles and sizes might be different for paper and on-screen display.
Another reason procedural markup inhibits reuse is that it is often system-dependent. The formatting codes used for one document generation system are generally incompatible with other document generation systems. When writers using different document generation systems need to share text, they often must convert the files from the format of the sender's document generation system to a plain-text (ASCII) format, which removes all procedural markup codes. Then the receiver must insert the procedural markup codes of his/her document generation system into the ASCII file. This process is labor-intensive, time-consuming, and costly.
Another limitation of existing document generation systems is their inability to encapsulate information about text. For example, a given paragraph or table might apply to a given variation (such as hardware platform), but there is no inherent way to attach this information to the text so that variation-specific documents can be generated.
To address the inherent limitations with existing document generation systems, the ISO 8879 Standard Generalized Markup Language (SGML) was published in 1986, and it has become increasingly popular in the industry. SGML is a language for describing the structure and content of a document. It is a structural, not procedural (or physical) markup language. In structural markup languages, the markup used within the source data files identifies the kind of information stored in each data element (such as heading, paragraph, table), rather than the physical presentation of that element (such as typeface or table format). Therefore, text authored with SGML is highly reusable. The same text can be reused for various output media (such as printed documents and on-screen help text), and it is system-independent so it can be shared by writers using different document generation systems.
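The reuse benefit of structural markup can be illustrated with a minimal Python sketch. It is not SGML itself: the element names (`heading`, `paragraph`, `note`) and the two rendering conventions are assumptions made for illustration only. The point is that the source records what each piece of text is, and each output medium decides separately how to present it.

```python
# Hypothetical structural source: each entry names the KIND of information,
# not its typeface or layout.
document = [
    ("heading", "Installing the Product"),
    ("paragraph", "Insert the first disk and run the setup program."),
    ("note", "Back up your files before you begin."),
]


def render_print(elements):
    """Format the elements for a printed page (illustrative conventions)."""
    lines = []
    for kind, text in elements:
        if kind == "heading":
            lines.append(text.upper())
        elif kind == "note":
            lines.append("    NOTE: " + text)
        else:
            lines.append(text)
    return "\n".join(lines)


def render_screen(elements):
    """Format the same elements for on-screen help (illustrative conventions)."""
    lines = []
    for kind, text in elements:
        if kind == "heading":
            lines.append("== " + text + " ==")
        elif kind == "note":
            lines.append("[!] " + text)
        else:
            lines.append(text)
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_print(document))
    print()
    print(render_screen(document))
```

Because no formatting decision is stored in `document`, adding a third output medium would mean adding a third rendering function, with no change to the source text.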
To understand SGML, it is helpful to briefly examine the SGML tags and how they are used in a source data file. Each tag is enclosed in less-than and greater-than symbols (<>); for example, a tag that specifies the beginning of a paragraph might look like <p>. In an SGML source data file, each of the various elements is clearly distinguished with beginning and ending tags. The ending tags are preceded by a forward slash (/) character. Authors can use commercially available software tools to insert the required tags into the data file, or they can code the markup directly using an ASCII editor.
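As a rough illustration of the tagging convention just described, the following Python sketch holds a small tagged fragment and checks that every beginning tag is closed by a matching ending tag. The element names (`section`, `title`, `p`) are hypothetical, and the check assumes every element carries explicit beginning and ending tags; it is not a conforming SGML parser.

```python
import re

# Hypothetical tagged source: each element has a beginning tag and an
# ending tag, and the ending tag is marked with a forward slash.
SOURCE = "<section><title>Setup</title><p>Run the installer.</p></section>"

# Matches a beginning tag such as <p> or an ending tag such as </p>.
TAG = re.compile(r"<(/?)([A-Za-z][A-Za-z0-9]*)>")


def tags_balanced(text):
    """Return True if every beginning tag is closed by a matching ending tag."""
    open_elements = []
    for match in TAG.finditer(text):
        is_ending, name = match.group(1), match.group(2)
        if not is_ending:
            open_elements.append(name)
        elif not open_elements or open_elements.pop() != name:
            return False
    return not open_elements


if __name__ == "__main__":
    print(tags_balanced(SOURCE))
    print(tags_balanced("<p>An unclosed paragraph"))
```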
The specific names used for the tags within an SGML data file and the hierarchical relationships between the various elements are based on a set of rules, called a Document Type Definition (DTD). The DTD is written according to a rigorous language defined by the ISO 8879 standard. The rules described by the DTD follow standards that have been defined for a particular type of information data element (e.g., a bullet list must contain more than one bullet, a second level heading must precede a third level heading).
In a typical SGML implementation, the SGML source file and its associated DTD are read into a validation software program. The validation program parses the source file and determines whether the file conforms to the rules defined by the DTD. If any rules are violated, an error is detected. Errors can range from syntax errors, such as misspellings, to missing elements. When validation is successful, all elements needed to make up a document are present in the correct order. The data is then ready for production (formatting), a stage that is handled by software applications that format the SGML data elements for specific output media. More information on SGML and on the use of SGML data files and Document Type Definitions can be found in Standard Generalized Markup Language (SGML) International Standard (ISO) 8879, First Edition, and in SGML (ISO) 8879: 1986/Amendment 1, both published in 1986. Both publications are available from Graphic Communications Association (GCA) Publications and Resources in Alexandria, Va.
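The validation step can be sketched in Python under simplifying assumptions. The two rules checked here are the examples mentioned earlier (a list must contain more than one item, and a second-level heading must precede a third-level heading). The element names (`h1`, `h2`, `h3`, `list`) and the flat list representation are hypothetical; a real validator reads the rules from a DTD rather than hard-coding them.

```python
def validate(elements):
    """Check already-parsed elements against two hard-coded structural rules.

    elements: list of (name, payload) tuples in document order.
    Returns a list of error messages; an empty list means the input conforms.
    """
    errors = []
    seen_h2 = False
    for position, (name, payload) in enumerate(elements, start=1):
        if name == "h1":
            # A new first-level heading starts a new scope for lower levels.
            seen_h2 = False
        elif name == "h2":
            seen_h2 = True
        elif name == "h3" and not seen_h2:
            errors.append(f"element {position}: h3 appears before any h2")
        elif name == "list" and len(payload) < 2:
            errors.append(f"element {position}: list has fewer than two items")
    return errors


if __name__ == "__main__":
    conforming = [
        ("h1", "Overview"),
        ("h2", "Requirements"),
        ("h3", "Memory"),
        ("list", ["4 MB RAM", "20 MB disk"]),
    ]
    violating = [
        ("h1", "Overview"),
        ("h3", "Memory"),
        ("list", ["4 MB RAM"]),
    ]
    print(validate(conforming))
    for message in validate(violating):
        print(message)
```

Only when the returned error list is empty would the data move on to the production (formatting) stage.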
The concept of object orientation has become popular in the field of computer programming because it enables reuse of programming code objects. Similarly, the concept of object orientation can be applied to technical writing, where "objects" of documentation (including text, graphics, and other forms) can be reused in many ways for many different documents. This object-oriented information management addresses many of the problems associated with authoring increasingly complex technical documentation.
By definition, SGML provides a mechanism for object-oriented integrated information management. Under an object-oriented information management strategy, individual snippets (or objects) of data become part of an organization's database of information. Documents are then formed by piecing together pertinent objects. SGML provides a mechanism for encapsulating data within information objects. For example, data describing an object's purpose or links to other objects may be encapsulated within an information object.
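The idea of information objects that carry encapsulated data can be sketched in Python. Everything named here (the `InfoObject` class, its fields, the sample database, and the `assemble` and `linked_texts` helpers) is a hypothetical illustration of the strategy described above, not a structure defined by SGML.

```python
from dataclasses import dataclass, field


@dataclass
class InfoObject:
    """A snippet of documentation plus data encapsulated with it."""
    object_id: str
    text: str
    purpose: str                                 # e.g. "concept" or "procedure"
    links: list = field(default_factory=list)    # ids of related objects


# Hypothetical database of information objects, keyed by object id.
DATABASE = {
    "intro": InfoObject("intro", "The product manages print queues.", "concept"),
    "install": InfoObject("install", "Run the setup program.", "procedure",
                          links=["intro"]),
    "remove": InfoObject("remove", "Run the uninstall program.", "procedure"),
}


def assemble(database, purpose):
    """Form a document by piecing together the objects with a given purpose."""
    return [obj.text for obj in database.values() if obj.purpose == purpose]


def linked_texts(database, object_id):
    """Follow an object's encapsulated links to the text of related objects."""
    return [database[target].text for target in database[object_id].links]


if __name__ == "__main__":
    print(assemble(DATABASE, "procedure"))
    print(linked_texts(DATABASE, "install"))
```

The selection logic lives outside the objects themselves: the objects only carry the encapsulated data, and a separate step decides which of them are pertinent to a given document.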
As a standard language specification, however, SGML defines only the way information is described in the source data files. SGML does not define how to manipulate the data elements or how to generate output documents from the source data files. Although SGML provides the ability to encapsulate data within information objects, it does not define a method or system for manipulating the information objects in order to produce a variety of documents from a single document file.
There is a need in the art for a method of authoring a single source document that contains multiple variations of a subject such that the single source document can be used to generate a variety of documents based on these variations. In addition, there is a need in the art for a system for using that source document to generate a document tailored to each variation or documents containing combinations of variations, where variation-specific information is clearly identified.
Finally, there is a need in the art for a system and method of producing documents that can identify data encapsulated within an information object as specific to a particular variation, that can manipulate the encapsulated data objects, that can generate multiple output documents from a single source file that contains information for multiple variations, and that permits variety in the presentation of the encapsulated variation data in the output documents.