1. Field of the Invention
This invention relates generally to mapping structured information to different structured information. The present invention relates more specifically to processing a document encoded in a markup language format, a database information format, an ISO/IEC 9070 naming scheme, a UNIX file name scheme, or a DOS file name scheme, and transforming it into another markup language format, another database information format, an ISO/IEC 9070 naming scheme, a UNIX file name scheme, or a DOS file name scheme. The invention relates still more specifically to a method and apparatus for mapping in which a user interactively defines the mapping for the transformation.
2. Discussion of the Background
Standard Generalized Markup Language ("SGML") is an information management standard adopted by the International Organization for Standardization ("ISO"), as ISO 8879:1986, as a means for providing platform-independent and application-independent documents that retain content, indexing, and linked information. SGML provides a grammar-like mechanism for users to define the structure of their documents and the tags they will use to denote the structure in individual documents. A complete description of SGML is provided in Goldfarb, C. F., The SGML Handbook, Oxford University Press, Oxford, 1990, and McGrath, S., Parseme.1st: SGML for Software Developers, Prentice Hall PTR, New Jersey, 1998, which are incorporated herein by reference.
HyperText Markup Language ("HTML") is an application of SGML that uses tags to mark elements, such as text or graphics, in a document to indicate how Web browsers should display these elements to the user and should respond to user actions such as activation of a link by means of a key press or mouse click. HTML is used for documents on the World Wide Web. HTML 2.0, defined by the Internet Engineering Task Force ("IETF"), includes features of HTML common to all Web browsers as of 1995, and was the first version of HTML widely used on the World Wide Web. Future HTML development will be carried out by the World Wide Web Consortium ("W3C"). HTML 3.2, the latest proposed standard, incorporates features widely implemented as of early 1996. A description of SGML and HTML features is given in Bradley, N., The Concise &lt;SGML&gt; Companion, Addison Wesley Longman, New York, 1997, which is incorporated herein by reference.
A markup language generally is a set of codes in a text file that instruct a computer how to format non-code text, graphics, or other types of files on a printer, video display, or other output device, or how to index and link its contents. From a historical perspective, the word "markup" is carried over from the setting in which a scrivener creates a document in handwritten form and hands it off to a secretary for processing into typewritten text. In creating and proofreading documents to get them into their final form, the scrivener or human word processor makes annotations or marks on intermediate documents to indicate operations such as converting upper case to lower case or vice versa, centering text, requesting text to be in bold format or italic format, and requesting a new paragraph with a symbol to denote the new paragraph format.
Markup languages typically provide symbolic means for accomplishing effects similar to those described above by insertion of tags in the text. For example, in HTML, a `<P>` inserted in a string of text denotes a new paragraph to be formatted. End tags typically turn off a particular requested format. In HTML, a `</P>` denotes the end of the current paragraph. A `<B>` denotes turning on a bold format for the text that follows. A `</B>` denotes turning off a bold format for the text that follows. Markup languages generally are designed to enable documents and other files to be platform-independent and highly portable between applications.
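The way start tags and end tags bracket the text they affect can be illustrated with a minimal sketch, offered only as an illustration and not as part of any described system. It assumes Python's standard `html.parser` module; the class name `TagLogger`, the function name `tag_events`, and the sample string are hypothetical. Note that this parser reports tag names in lower case.

```python
from html.parser import HTMLParser


class TagLogger(HTMLParser):
    """Hypothetical helper: records start tags, end tags, and text in document order."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        # A start tag such as <B> turns a format on.
        self.events.append(("start", tag))

    def handle_endtag(self, tag):
        # An end tag such as </B> turns that format off.
        self.events.append(("end", tag))

    def handle_data(self, data):
        # Non-code text that lies between tags.
        self.events.append(("text", data))


def tag_events(markup):
    """Return the sequence of tag and text events found in a markup string."""
    parser = TagLogger()
    parser.feed(markup)
    parser.close()
    return parser.events


if __name__ == "__main__":
    # Illustrative sample: one paragraph containing a bold phrase.
    for event in tag_events("<P>Plain and <B>bold</B> text.</P>"):
        print(event)
```

In this sketch the bold format applies only to the text event that falls between the `b` start event and the matching `b` end event, which mirrors the on/off behavior described above.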
ISO and the International Electrotechnical Commission ("IEC") form a specialized system for worldwide standardization. ISO/IEC 9070:1991(E) is an international standard that applies to the assignment of unique owner prefixes to owners of public text conforming to ISO 8879. The standard describes the procedures for making an assignment and the method for constructing registered owner names from them. Procedures for self-assignment of owner prefixes by standards bodies and other organizations are also specified. ISO/IEC 9070:1991(E) is incorporated herein by reference.
UNIX and DOS are well-known operating systems for computers. Both UNIX and DOS support a file naming scheme that involves a path from a root directory, through descendant directories, to leaf nodes, which are non-directory file names.
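The hierarchical structure of such file names can be shown with a brief sketch, given here only as an illustration. It assumes Python's standard `pathlib` module; the function name `path_components` and both sample paths are hypothetical and do not come from any described system.

```python
from pathlib import PurePosixPath, PureWindowsPath


def path_components(path):
    """Split a path into its root, descendant directories, and leaf file name."""
    return list(path.parts)


# Hypothetical UNIX-style name: root "/", then directories, then the leaf file.
unix_parts = path_components(PurePosixPath("/usr/local/doc/report.sgm"))

# Hypothetical DOS-style name: drive and root "C:\", then a directory, then the leaf file.
dos_parts = path_components(PureWindowsPath(r"C:\DOCS\REPORT.SGM"))

if __name__ == "__main__":
    print(unix_parts)
    print(dos_parts)
```

In both cases the first component is the root, the last component is the non-directory leaf, and the components in between are the descendant directories, so either kind of name can be treated as an ordered sequence of structured parts.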
Processing systems are known in which a data processor converts a document encoded in a markup language automatically to another format. For example, Balise software from Computing Art, Inc. processes documents encoded in SGML to convert them to a formatted output for user viewing. However, this software does not allow the user to interactively define the mapping of SGML tags to another format.