1. Field of the Invention
This invention relates to the field of software compilers, and more particularly relates to a method of generating files of information from one or more source files.
2. Description of the Related Art
In March 1989, the European Laboratory for Particle Physics, or CERN (Conseil Européen pour la Recherche Nucléaire), developed the World-Wide-Web (WWW, or simply, "the web"), an Internet-based computer network that allows users on one computer to access information stored on other computers through a world-wide network. With an intuitive user interface, known as a web browser, the web rapidly became a popular way of transmitting and accessing text and binary information. Since then, there has been a massive expansion in the number of World-Wide-Web sites and in the amount of information placed on the web.
Information, in the form of electronic files, documents, images, sounds and other formats, forms the basis of Internet and web content, and is the key to creating a useful and meaningful web-site.
To place information on the web, the information must be stored in a binary or text format in a "file." Binary documents are saved in known formats that depend upon the information being stored. For example, two-dimensional pictures are often stored in the "Joint Photographic Experts Group" (JPEG) or "Graphics Interchange Format" (GIF) standard formats. Audio files and moving images have other formats as well, such as "WAV," "MOV," and "MPEG." Text documents are stored in a HyperText Markup Language (HTML) format. The HTML format dictates the appearance and structure of a web text document, also referred to as a "web page."
Although these formats are required for compatibility with web browsers, modifying web sites and updating information in these rigid formats is difficult and time consuming. For example, suppose every web page had a copyright notice on it. To update the copyright notice on every page, a web-site administrator would have to either change every page by hand or use a method of global-search-and-replace. However, because of the non-uniform manner of some web-sites, a global-search-and-replace may not work. More complicated web page changes, such as modifying small applications known as "applets," are even more difficult. It would be much better if there were a single location or file that could be updated, with the change then propagated to the entire web-site, or just to the appropriate web pages. Very simply put, the problem of maintaining and generating large amounts of data, in any format, is difficult and highly time consuming.
Several solutions have been proposed, but each has its problems.
Some web developers choose to generate web pages through a "what you see is what you get" (WYSIWYG) web-page editor. Such editors assemble web pages through a graphical interface, which makes designing pages simpler, but the results are limited because such editors do not address the need to maintain the information. Using the above example, to update the copyright notice on every page, a web-site administrator would still have to edit the web pages individually, or rely on the web-page editor program's method of global-search-and-replace.
Alternatively, simple pre-processor programs have been used to assemble HTML files. Such pre-processors allow web-page designers to pre-process documents and insert listed documents into a master document. For example, to include another listed document file called "foo.doc" into the master document, a web-page designer could type:
#include "foo.doc"
and the listed document would be included. While this allows fragments of common HTML code to be inserted into documents, as a web-site grows, and more pages are added to the site, the maintenance of such a system quickly becomes a logistical nightmare. Also, the fragments cannot be redefined at the point that they are included in a document. Moreover, such a system is limited strictly to text-based documents, and cannot handle binary forms of information.
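The behavior of such a text-only pre-processor can be sketched as follows; the directive syntax mirrors the `#include "foo.doc"` example above, while the function names and the nesting behavior are illustrative assumptions rather than a description of any particular product:

```python
import re

# Matches a line of the form: #include "name"
INCLUDE_RE = re.compile(r'^#include\s+"(?P<name>[^"]+)"\s*$')

def preprocess(text, read_file):
    """Expand #include "name" lines by splicing in the named file's text.

    `read_file` maps a file name to its contents.  Included files are
    themselves preprocessed, so includes may nest.
    """
    out = []
    for line in text.splitlines():
        m = INCLUDE_RE.match(line)
        if m:
            out.append(preprocess(read_file(m.group("name")), read_file))
        else:
            out.append(line)
    return "\n".join(out)
```

Note that this sketch exhibits exactly the limitations described above: the included fragment is spliced in verbatim, so it cannot be redefined or parameterized at the point of inclusion, and the line-oriented splicing only works for text, not for binary forms of information.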
U.S. Pat. No. 5,181,162, issued Jan. 19, 1993 to Smith et al. and entitled "Document management and production system," discloses a system of decomposing documents into logical components, which are stored as discrete "objects" in an object-oriented computational environment. The system relies on queries to a relational database, which occur every time the document is printed, displayed electronically, or electronically transmitted. For a web site, which may transmit pages thousands of times per minute, this solution is a burden on the web server's computing resources. Consequently, the system would be slow, and of limited usefulness in such a high-demand environment. Similarly, the use of a relational database to deliver pages of information to client machines has been attempted; while this provides dynamic construction of documents when they are delivered to the client machines, this solution also burdens the server's computing resources because page information would be constantly regenerated. Although caching generated pages may solve some of the computing resource problems, it creates a new problem because cached pages may be outdated.
Several related patents, U.S. Pat. No. 5,668,999, which issued Sep. 16, 1997 to Gosling ("System and method for preverification of stack usage in bytecode program loops"), U.S. Pat. No. 5,692,047, issued to McManis ("System and method for executing verifiable programs with facility for using non-verifiable programs from trusted sources"), and U.S. Pat. No. 5,706,502, issued to Foley et al. ("Internet-enabled portfolio manager system and method"), also fail to solve the problem. Collectively, these patents disclose a method and system of verifying the integrity of computer programs written in a bytecode language to run applications remotely on a client workstation. While this solution may create dynamic client-machine applications, it does not solve the problem of maintaining information in a system.
What is needed is a more flexible way of handling both binary and text information that can produce files of different file formats and still be easy to maintain.
The invention, a Text Object Compiler method, allows users to abstract information and produce it in virtually any file format.
Almost every contemporary computer is a register-based von Neumann computer that responds to a machine language. These machine languages include instructions that operate on the contents of registers. Originally, computer software instructions were organized in terms of machine-language operations. As computers became more complex, programming in machine language became difficult and increasingly cumbersome. Consequently, computer scientists abstracted machine instructions, creating higher-level languages, known as source languages, structured in terms of expressions and procedures. As software evolved, two strategies for converting source languages into machine-language instructions developed: interpreters and compilers.
An interpreter, written in the native machine language, configures the computer to execute programs written in one of the source languages. The primitive operators or commands of the source language are implemented as a library of subroutines written in the native machine language of the given machine. Interpreters read the source language, one line at a time, and then perform the specified operation. A program to be interpreted, the source program, is represented as a data structure. The interpreter traverses this data structure, analyzing the source program. As it does so, it simulates the intended behavior of the source program by calling appropriate primitive operators from the library.
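The traversal described above can be sketched in a few lines; the nested-tuple program representation and the two-entry primitive library are illustrative assumptions standing in for the native-machine-language subroutine library:

```python
# Library of primitive operators, standing in for subroutines written
# in the native machine language of the given machine.
PRIMITIVES = {
    "add": lambda a, b: a + b,
    "mul": lambda a, b: a * b,
}

def interpret(expr):
    """Traverse the source program (a nested tuple acting as the data
    structure) and simulate its intended behavior by calling the
    appropriate primitive operator at each node."""
    if isinstance(expr, (int, float)):   # a literal needs no analysis
        return expr
    op, *args = expr                     # an operation node
    return PRIMITIVES[op](*(interpret(a) for a in args))
```

For example, `interpret(("add", 1, ("mul", 2, 3)))` walks the tree, calls the `mul` primitive on 2 and 3, then the `add` primitive on 1 and the result. The key point is that all of this analysis happens during execution, which is what compilation avoids.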
Instead of analyzing and translating the source program into machine language during execution, it is possible to perform these tasks before execution, enabling more efficient program execution. This alternate method of converting source languages into instructions is called compilation. The program that does the analysis of the source program and reduces the source program to machine language is called a compiler. As shown in FIG. 1, a conventional (i.e., prior art) compiler 2 for a given source language and machine translates computer source code 1 (i.e., a program written in a high level "computer language") into an object code 3, a program written in the computer's native language, referred to in the art as "machine language."
As illustrated by FIG. 2, a conventional compiler 2 is composed of a lexical analyzer 10, a parser 20, and a code generator 30. The lexical analyzer 10 takes computer source code 1 and divides the code into lexical tokens. Such lexical tokens can be based on instructions or other keywords in the relevant high-level computer language. The parser 20 takes the tokens and groups them together logically based on the relationships established by the source language and the computer source code 1. Lastly, the code generator 30 takes the relationships established by the parser 20 and translates them into executable computer object code 3 in computer machine language.
Conventional compilers are well known in the prior art, such as U.S. Pat. No. 5,560,015 ("Compiler and method of compilation" issued to Onodera on Sep. 24, 1996), U.S. Pat. No. 5,442,792 ("Expert system compilation method" issued to Chun on Aug. 15, 1995), and U.S. Pat. No. 5,768,592 ("Method and apparatus for managing profile data" issued to Chang on Jun. 16, 1998).
Conventional interpreters and compilers convert high-level computer source code into object code to be executed on a computer. In effect, the interpreter and the compiler allow computer programmers to write computer programs at a higher level of abstraction, and generate object code.