A portion of the disclosure of this patent document contains material that is subject to copyright protection. The copyright owner has no objection to the xerographic reproduction by anyone of the patent document or the patent disclosure in exactly the form it appears in the Patent and Trademark Office patent file or records, but otherwise reserves all copyright rights whatsoever.
The present invention relates to the field of document processing. More specifically, one embodiment of the invention provides for an improved method and apparatus for processing documents, text and programs in a uniform and easily maintained manner.
With the growth of the Internet, the global network interconnecting many smaller computer networks and individual computers, many software development and distribution schemes take advantage of the fact that many people are connected together. For example, using programs written in the Java™ programming language developed by Sun Microsystems of Mountain View, Calif., U.S.A., a software provider can place one copy of a program on an Internet server and many Internet clients can run that program as if it were present on the client machine.
As used herein, an "Internet server" refers to a computer, or a collection of computers, which is connected to the Internet and responds to requests directed at the Internet server. An "Internet client" is a computer, or collection of computers, which is connected to the Internet and sends requests to an Internet server. In some cases, one computer or collection of computers may act as a client for one set of requests and as a server for another set of requests.
Several commonly used protocols exist for handling requests and responses to those requests, depending on the nature of the request. For example, the File Transfer Protocol (FTP) is a protocol used by a client to request a file from a server. The Hypertext Transfer Protocol (HTTP) is a protocol used by a client to request a hypertext document and used by a server to return requested documents as well as to transport server-initiated objects. Collectively, hypertext documents linked to other hypertext documents, when viewed using an HTTP browser, have been referred to as the "World Wide Web" or the "Web". These protocols typically operate on top of a lower level protocol suite known as the Transmission Control Protocol/Internet Protocol (TCP/IP). Each of these protocols is well documented in existing literature and on the Internet, so they need not be described here in further detail.
The HTTP protocol has evolved from a protocol for transporting static, pre-existing hypertext documents to a protocol which allows for servers to generate hypertext documents on-the-fly based on the nature and parameters of the client's request, session "state" maintained by the server for that particular client, and many other varied factors. For example, instead of a request being directed to a static, pre-existing hypertext page stored on a server, the request could be directed to a script, such as a Common Gateway Interface (CGI) script. With such a script, a client sends the server a request that could specify either a static document or a script, but the server determines that the request is directed to a script and responds by executing the script and returning the output of the script as the request result.
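The dispatch step described above can be illustrated with a minimal Python sketch. It is not taken from the disclosure: the paths, the `hello_script` function and the `handle_request` name are hypothetical, and a real server would launch a separate CGI process rather than call a function.

```python
import html
from typing import Callable, Dict

# Hypothetical static documents, keyed by request path.
STATIC_DOCUMENTS: Dict[str, str] = {
    "/index.html": "<html><body>Welcome</body></html>",
}


def hello_script(params: Dict[str, str]) -> str:
    """Hypothetical script: generates a page from the request parameters."""
    name = html.escape(params.get("name", "stranger"))
    return f"<html><body>Hello, {name}</body></html>"


# Hypothetical scripts, keyed by request path.
SCRIPTS: Dict[str, Callable[[Dict[str, str]], str]] = {
    "/cgi-bin/hello.cgi": hello_script,
}


def handle_request(path: str, params: Dict[str, str]) -> str:
    """Return a stored document, or run a script and return its output."""
    if path in SCRIPTS:
        # The server, not the client, decides that this path names a script.
        return SCRIPTS[path](params)
    if path in STATIC_DOCUMENTS:
        return STATIC_DOCUMENTS[path]
    return "<html><body>Not found</body></html>"
```

The client issues the same kind of request in both cases; only the server's lookup distinguishes a stored page from generated output.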
FIG. 1 illustrates how such a scripting system might operate. FIG. 1 shows a browser 12 and a server 14, on which a server supervisor 20 is executed. Server supervisor 20 handles I/O with browser 12 (which is a client in this example), has access to one or more forms 22 and to CGI scripts such as script 24, and stores the output 26 of script 24 for transport to browser 12. Although not shown, it should be understood that a network or the Internet might be interposed between browser 12 and server 14.
FIG. 1 also shows the details of one such form 22 and script 24. In operation, browser 12 sends server 14 a reference that server 14 interprets as a request for form 22. As shown, form 22 is a form for requesting a name and phone number from the user of browser 12. Form 22 is sent to browser 12, which presents the user with a suitable form to be filled out. Browser 12 presents form 22 according to the instructions contained in form 22. In this example, these instructions are in the form of HTML (HyperText Markup Language, an application of the Standard Generalized Markup Language, or "SGML") tagged text.
In response to submission of the filled-out form, server 14 presents the filled-out form to script 24, which in this example is called "phone.cgi" and is referenced in form 22. Script 24 is written in a scripting language known as PERL. The output 26 of script 24 for a given form input, which can be determined with an understanding of PERL, is "Thank you. Your entry was:" followed by the name and phone number entered. The script also adds the entry to a file called "phonebook.txt".
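For readers unfamiliar with PERL, the behavior attributed to script 24 can be approximated in Python as follows. This is only a sketch of the described behavior, not the script shown in FIG. 1: the field names `name` and `phone`, the tab-separated file layout and the function name `handle_phone_form` are assumptions.

```python
import html
import os
import tempfile
from urllib.parse import parse_qs


def handle_phone_form(form_body: str, phonebook_path: str) -> str:
    """Record a submitted entry and return the confirmation text."""
    fields = parse_qs(form_body)
    # Field names are assumed; they must match the names used in the form.
    name = fields.get("name", [""])[0]
    phone = fields.get("phone", [""])[0]
    # Append the entry to the phone book file (assumed tab-separated).
    with open(phonebook_path, "a", encoding="utf-8") as phonebook:
        phonebook.write(f"{name}\t{phone}\n")
    return f"Thank you. Your entry was: {html.escape(name)} {html.escape(phone)}"


# Usage: submit one URL-encoded form body against a temporary phone book.
with tempfile.TemporaryDirectory() as workdir:
    path = os.path.join(workdir, "phonebook.txt")
    response = handle_phone_form("name=Ada+Lovelace&phone=555-0100", path)
    with open(path, encoding="utf-8") as phonebook:
        saved = phonebook.read()
```

The sketch makes the coupling visible: the strings `name` and `phone` must be agreed between whoever writes the form and whoever writes the script.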
One problem with this approach is that two different skill sets, and often two different sets of product developers, are needed to coordinate form development and script development. The forms developers, who may be technical writers familiar with HTML, need to coordinate with programmers writing PERL code so that the variable names and fields in the form match up with variable names and inputs in the script. Coordination is also needed for other languages, such as Java or C.
From the above it is seen that an improved method and apparatus which integrates documents and behavior (programs) associated with those documents is needed.
An improved document processing system is provided by virtue of the present invention, wherein documents and processing associated with those documents are combined by structuring documents according to a common structure applicable to both the documents themselves and the processes that are applied to the documents.
In one embodiment of a client-server document processing system in which the present invention is implemented, an agency is interposed between clients and servers, wherein the agency operates one or more agents which operate on documents which pass between the client and server. Each agent is a set of active documents, where an active document is a structured document containing text and/or behavior. An agent can also be thought of as a software object with behaviors specified by active documents.
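The interposition of an agency between client and server can be pictured with the following Python sketch. It is a simplification under stated assumptions: an agent is modeled here as a plain function on a document string rather than as a set of active documents, and the names `Agency`, `register`, `handle`, `origin_server` and `add_footer` are hypothetical.

```python
from typing import Callable, List

# Simplifying assumption: an agent is a function from document to document.
Agent = Callable[[str], str]


class Agency:
    """Sits between a client and a server and runs agents on passing documents."""

    def __init__(self, server: Callable[[str], str]) -> None:
        self.server = server
        self.agents: List[Agent] = []

    def register(self, agent: Agent) -> None:
        self.agents.append(agent)

    def handle(self, request: str) -> str:
        # Forward the client's request, then let each agent operate on the
        # returned document, in registration order, before it reaches the client.
        document = self.server(request)
        for agent in self.agents:
            document = agent(document)
        return document


def origin_server(request: str) -> str:
    """Hypothetical server that returns a document for a request path."""
    return f"<p>content for {request}</p>"


def add_footer(document: str) -> str:
    """Hypothetical agent that annotates documents passing through."""
    return document + "<p>served via agency</p>"


agency = Agency(origin_server)
agency.register(add_footer)
result = agency.handle("/index.html")
```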
The active documents operate on a network in a context of strings, streams and parse trees, which allows programs to be embedded in documents, and since the documents are structured, the programs have the same syntax as documents. Furthermore, since documents are structured, their elements can be used as data structures.
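One way to picture a program sharing the syntax of the document that contains it is the Python sketch below, which evaluates a small parse tree. The `<set>` and `<get>` tags are hypothetical stand-ins chosen for illustration and are not the active document language of this disclosure; all other tags are treated as passive and reduced to their text.

```python
import xml.etree.ElementTree as ET
from typing import Dict


def expand_children(node: ET.Element, context: Dict[str, str]) -> str:
    """Concatenate a node's text with the expansion of each child element."""
    parts = [node.text or ""]
    for child in node:
        parts.append(expand(child, context))
        parts.append(child.tail or "")
    return "".join(parts)


def expand(node: ET.Element, context: Dict[str, str]) -> str:
    """Evaluate one element: hypothetical <set>/<get> tags carry behavior."""
    if node.tag == "set":
        # Store the element's expanded content under the given name.
        context[node.attrib["name"]] = expand_children(node, context)
        return ""
    if node.tag == "get":
        # Replace the element with the stored value.
        return context.get(node.attrib["name"], "")
    # Any other element is passive: only its content is kept.
    return expand_children(node, context)


source = '<doc><set name="user">Ada</set>Hello, <get name="user"/>!</doc>'
root = ET.fromstring(source)
context: Dict[str, str] = {}
output = expand(root, context)
```

The same tree serves as a data structure: `root.find("set").attrib["name"]` reads the variable name directly from the parsed document.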
Applications of the document processing system include network office appliances over the Web using standard protocols and software agencies which combine client, server and proxy functions.
One advantage of active documents and the agency system is that client and server-specific software need only address low-level functions, as higher level functions can be implemented as active documents. With the active document language being the mode of development, document-oriented computing can be implemented easily, as a unified language is used to specify document content (data) and document processing (behavior).
Another advantage of an active documents-based document processing system is that the agents themselves are representable as active documents.