1. Field of the Invention
This invention is directed to a method and apparatus for converting the format of a document. More particularly, this invention relates to a method and apparatus for converting a textually encoded document to a binary encoded document. Still more particularly, this invention relates to the conversion of a clear text encoded Standard Page Description Language document to a binary encoded Standard Page Description Language document. This invention also performs a data compression function by converting a document to a format of reduced size.
2. Discussion of the Background
A standardized page description language has been proposed and is being developed as an international standard by the International Organization for Standardization ("ISO"). The proposal, to which one of the inventors is a contributor, is currently in draft form before a section of the ISO. The draft is known as ISO/IEC DIS 10180, labeled "INFORMATION PROCESSING-TEXT COMMUNICATION-STANDARD PAGE DESCRIPTION LANGUAGE". It is available from the American National Standards Institute ("ANSI") in New York and is incorporated herein by reference.
Standard Page Description Language ("SPDL") is a hierarchically structured page description language. This structured hierarchy allows a portion of a document to be printed without tracing through the entire document for formatting commands which may affect the particular portion being printed. Only the portion of the document which is hierarchically above the portion being printed needs to be processed to print the desired portion.
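The benefit of the structured hierarchy can be illustrated with a minimal sketch. The node names, the settings, and the `effective_settings` helper below are hypothetical and are not taken from the SPDL draft; the sketch only assumes that formatting state is inherited from hierarchically higher elements, so that resolving the state for one page requires visiting that page's ancestors and nothing else.

```python
class Node:
    """A hypothetical element in a hierarchically structured document."""

    def __init__(self, name, settings=None):
        self.name = name
        self.settings = settings or {}
        self.parent = None
        self.children = []

    def add(self, child):
        child.parent = self
        self.children.append(child)
        return child


def effective_settings(node):
    """Merge settings from the root down to `node`, visiting ancestors only."""
    chain = []
    current = node
    while current is not None:
        chain.append(current)
        current = current.parent
    merged = {}
    for element in reversed(chain):  # root first, so nearer settings override
        merged.update(element.settings)
    return merged


document = Node("document", {"font": "Times", "size": 10})
chapter1 = document.add(Node("chapter1", {"size": 12}))
chapter2 = document.add(Node("chapter2", {"font": "Helvetica"}))
page = chapter2.add(Node("page7", {"margin": 36}))

# chapter1 is a sibling branch and is never visited when resolving page7.
print(effective_settings(page))
# {'font': 'Helvetica', 'size': 10, 'margin': 36}
```

In this sketch the cost of resolving the state for a page depends on the depth of the hierarchy rather than on the length of the document.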
An additional advantage of SPDL is that it conforms to the Standard Generalized Markup Language ("SGML") as defined in ISO 8879:1986. This allows the structure of documents to be described and tagged in a generic fashion. Once tagged in SGML, files can travel seamlessly from one platform to another without the use of conversion utilities and without the loss of structural formatting.
SPDL conforms to the Basic Encoding Rules defined for Abstract Syntax Notation One ("ASN.1"). A complete description of ASN.1 can be found in "ASN.1, The Tutorial and Reference," by Douglas Steedman, 1990, which is incorporated herein by reference.
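As a brief illustration of the Basic Encoding Rules, each encoded value consists of an identifier (tag) octet, a length, and the content octets. The following sketch encodes a single ASN.1 INTEGER in this tag-length-value form. The function name `ber_encode_integer` is hypothetical, only the short length form (content of at most 127 octets) is handled, and the sketch shows generic ASN.1 encoding rather than any SPDL-specific structure.

```python
def ber_encode_integer(value: int) -> bytes:
    """Encode an integer as a BER INTEGER: tag 0x02, length, content octets."""
    # Smallest number of octets that holds the value in two's complement.
    magnitude = value if value >= 0 else ~value
    length = magnitude.bit_length() // 8 + 1
    if length > 127:
        raise ValueError("long-form lengths are not handled in this sketch")
    content = value.to_bytes(length, "big", signed=True)
    return bytes([0x02, length]) + content


print(ber_encode_integer(300).hex())  # 0202012c
print(ber_encode_integer(-1).hex())   # 0201ff
```

The value 300 therefore occupies four octets in total: one for the tag, one for the length, and two for the content.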
A clear text language is a type of computer language which is human readable. A binary encoding of a document is an example of a non-clear text format, as a human cannot readily understand the contents of the document by looking at its binary or hexadecimal representation. A primary advantage of a binary encoded document is that it consumes much less storage space than the equivalent clear text encoded document and can therefore be transmitted in less time. However, a binary encoded document is difficult to edit and understand without the use of special software, whereas a clear text encoded document can be read and edited directly.
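The trade-off between the two encodings can be seen in a small sketch. The values, the space-separated text layout, and the fixed four-octet binary layout below are assumptions chosen for illustration only; neither layout is the SPDL clear text or binary format.

```python
import struct


def encode_clear_text(values):
    """Encode unsigned integers as space-separated decimal text."""
    return " ".join(str(v) for v in values).encode("ascii")


def encode_binary(values):
    """Encode unsigned integers as big-endian 32-bit values."""
    return struct.pack(f">{len(values)}I", *values)


values = [100000, 250000, 1234567, 7654321]
text = encode_clear_text(values)
binary = encode_binary(values)

print(text.decode("ascii"), len(text))  # readable as-is, 29 bytes
print(binary.hex(), len(binary))        # needs a decoder to interpret, 16 bytes
```

Here the clear text form can be read and edited directly, while the binary form is roughly half the size but is meaningless to a reader without knowledge of its layout.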
Therefore, as described above, there are advantages and disadvantages to both clear text encoding and binary encoding of a document, and it may be desirable to convert a document from one format to the other.