1. Field of the Invention
This invention relates to the generation of object-oriented World Wide Web pages.
2. Background
The Internet is a worldwide matrix of interconnected computers. An Internet client accesses a computer on the network via an Internet provider. An Internet provider is an organization that provides a client (e.g., an individual or other organization) with access to the Internet (via analog telephone line or Integrated Services Digital Network line, for example). A client can, for example, download a file from or send an electronic mail message to another computer/client using the Internet.
To retrieve a file on the Internet, a client must search for the file, make a connection to the computer on which the file is stored, and download the file. Each of these steps may involve a separate application and access to multiple, dissimilar computer systems. The World Wide Web (WWW) was developed to provide a simpler, more uniform means for accessing information on the Internet.
The components of the WWW include browser software, network links, and servers. The browser software, or browser, is a user-friendly interface (i.e., front-end) that simplifies access to the Internet. A browser allows a client to communicate a request without having to learn a complicated command syntax, for example. A browser typically provides a graphical user interface (GUI) for displaying information and receiving input. Examples of browsers currently available include Mosaic, Netscape, and Cello.
Information servers maintain the information on the WWW and are capable of processing a client request. Hypertext Transfer Protocol (HTTP) is the standard protocol for communication with an information server on the WWW. HTTP has communication methods that allow clients to request data from a server and send information to the server.
To submit a request, the client contacts the HTTP server and transmits the request to the HTTP server. The request contains the communication method requested for the transaction (e.g., GET an object from the server or POST data to an object on the server). The HTTP server responds to the client by sending a status of the request and the requested information. The connection is then terminated between the client and the HTTP server.
A client request, therefore, consists of establishing a connection between the client and the HTTP server, performing the request, and terminating the connection. The HTTP server does not retain any information about the request after the connection has been terminated. HTTP is, therefore, a stateless protocol. That is, a client can make several requests of an HTTP server, but each individual request is treated independently of any other request. The server has no recollection of any previous request.
An addressing scheme is employed to identify Internet resources (e.g., HTTP server, file or program). This addressing scheme is called Uniform Resource Locator (URL). A URL contains the protocol to use when accessing the server (e.g., HTTP), the Internet domain name of the site on which the server is running, the port number of the server, and the location of the resource in the file structure of the server.
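The four URL components listed above can be illustrated with a short Python sketch. The URL shown is a hypothetical example chosen for illustration, and the dictionary keys are labels of our own, not terms defined by the URL scheme.

```python
from urllib.parse import urlsplit

# Hypothetical URL, used only for illustration.
url = "http://www.example.com:8080/docs/index.html"

parts = urlsplit(url)
components = {
    "protocol": parts.scheme,    # protocol to use when accessing the server
    "domain": parts.hostname,    # Internet domain name of the site
    "port": parts.port,          # port number of the server
    "location": parts.path,      # location of the resource on the server
}
print(components)
```

When a URL omits the port, `parts.port` is `None`, and the client falls back to the default port for the protocol.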
The WWW uses a concept known as hypertext. Hypertext provides the ability to create links within a document to move directly to other information. To activate the link, it is only necessary to click on the hypertext link (e.g., a word or phrase). The hypertext link can be to information stored on a different site than the one that supplied the current information. A URL is associated with the link to identify the location of the additional information. When the link is activated, the client's browser uses the link to access the data at the site specified in the URL.
If the client request is for a file, the HTTP server locates the file and sends it to the client. An HTTP server also has the ability to delegate work to gateway programs. The Common Gateway Interface (CGI) specification defines the mechanisms by which HTTP servers communicate with gateway programs. A gateway program is referenced using a URL. The HTTP server activates the program specified in the URL and uses CGI mechanisms to pass program data sent by the client to the gateway program. Data is passed from the server to the gateway program via command-line arguments, standard input, or environment variables. The gateway program processes the data and returns its response to the server using CGI (via standard output, for example). The server forwards the data to the client using HTTP.
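The data flow between server and gateway program can be sketched in Python as follows. This is a minimal illustration, not a complete gateway: the function name `run_gateway` and the form field `name` are hypothetical, the environment is passed in as a plain dictionary so the example can run without a server, and text streams stand in for the byte streams a real server would supply. `REQUEST_METHOD`, `QUERY_STRING`, and `CONTENT_LENGTH` are standard CGI environment variables.

```python
import io
from html import escape
from urllib.parse import parse_qs


def run_gateway(environ, stdin, stdout):
    """Read client data as a CGI gateway program would, then write a response."""
    method = environ.get("REQUEST_METHOD", "GET")
    if method == "POST":
        # POST data arrives on standard input; CONTENT_LENGTH gives its size.
        length = int(environ.get("CONTENT_LENGTH") or 0)
        raw = stdin.read(length)
    else:
        # GET data arrives in the QUERY_STRING environment variable.
        raw = environ.get("QUERY_STRING", "")
    fields = parse_qs(raw)
    name = fields.get("name", ["world"])[0]
    # The response (a header, a blank line, then the body) goes to standard output.
    stdout.write("Content-Type: text/html\r\n\r\n")
    stdout.write(f"<HTML><BODY>Hello, {escape(name)}</BODY></HTML>\n")


# Simulate the server invoking the gateway program for a GET request.
out = io.StringIO()
run_gateway({"REQUEST_METHOD": "GET", "QUERY_STRING": "name=Ada"}, io.StringIO(), out)
print(out.getvalue())
```

In a real deployment the server would supply `os.environ`, `sys.stdin`, and `sys.stdout` in place of the simulated arguments.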
A browser displays information to a client/user as pages or documents. A language is used to define the format for a page to be displayed in the WWW. The language is called Hypertext Markup Language (HTML). A WWW page is transmitted to a client as an HTML document. The browser executing at the client parses the document and produces and displays a page based on the information in the HTML document.
HTML is a structural language that is composed of HTML elements that are nested within each other. An HTML document is a text file in which certain strings of characters, called tags, mark regions of the document and assign special meaning to them. These regions are called HTML elements. Each element has a name, or tag. An element can have attributes that specify properties of the element. Blocks or components include unordered lists, text boxes, check boxes, and radio buttons, for example. Each block has properties such as name, type, and value. The following provides an example of the structure of an HTML document:
    <HTML>
    <HEAD>
    information about the document
    </HEAD>
    <BODY>
    text and other information to be displayed
    </BODY>
    </HTML>
Each HTML element is delimited by the pair of characters "<" and ">". The name of the HTML element is contained within the delimiting characters. The combination of the name and delimiting characters is referred to as a marker, or tag. Each element is identified by its marker. In most cases, each element has a start and ending marker. The ending marker is identified by the inclusion of another character, "/", that follows the "<" character.
HTML is a hierarchical language. With the exception of the HTML element, every element is contained within another element. The HTML element encompasses the entire document. It identifies the enclosed text as an HTML document. The HEAD element is contained within the HTML element and includes information about the HTML document. The BODY element is also contained within the HTML element. The BODY element contains all of the text and other information to be displayed. Other HTML elements are described in an HTML reference manual.
The prior art HTML is not object-oriented. An HTML element is contained as a string within a flat, ASCII file. An application must be written to manipulate an HTML file. It would be beneficial to have the ability to map HTML elements to classes of objects that define the behavior of HTML elements.
In the present invention, HTML elements are mapped to objects in an object-oriented environment. Classes of objects are defined for each HTML element as well as the HTML document (or page). By providing a one-to-one mapping between each HTML element and object classes, HTML documents can be manipulated programmatically. The properties of each element are stored in instance variables of the associated object. Each object class can include methods to manipulate the HTML element within an HTML document.
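One way to picture the element-to-class mapping is the following Python sketch. The class names, the `ELEMENT_CLASSES` table, and the `set_attribute` method are assumptions made for illustration; the text above does not specify the classes or their interfaces.

```python
class HTMLElement:
    """Hypothetical base class; one subclass per HTML element."""

    tag = None

    def __init__(self, **attributes):
        # Properties of the element are stored in instance variables.
        self.attributes = dict(attributes)
        self.children = []

    def set_attribute(self, name, value):
        """Manipulate the element programmatically."""
        self.attributes[name] = value


class Body(HTMLElement):
    tag = "BODY"


class Input(HTMLElement):
    tag = "INPUT"


# One-to-one mapping between element names and object classes.
ELEMENT_CLASSES = {cls.tag: cls for cls in (Body, Input)}

checkbox = ELEMENT_CLASSES["INPUT"](name="agree", type="checkbox", value="yes")
checkbox.set_attribute("value", "no")
print(checkbox.tag, checkbox.attributes)
```

Because each element is an ordinary object, application code can change a property with a method call instead of editing a string in a flat file.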
An HTML document defines a World Wide Web page. An HTML document can be generated using HTML templates. Multiple HTML templates can be used to generate a single HTML document. An HTML template consists of HTML element statements. A parser parses an HTML template and generates an object tree.
The object tree is traversed during HTML document generation. When the HTML document is rendered, or generated, the root of the object tree is sent a message to create the document. The root object processes this message by creating its corresponding HTML element statement(s). The "create" message is then forwarded by the parent object to its children. This process is repeated until all of the objects in the object tree receive the message. Once all of the objects process the "create" message, the HTML document is generated.
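The parse-then-traverse flow described above can be sketched with Python's standard `html.parser` module. The `Node` and `TemplateParser` classes are hypothetical stand-ins for the parser and HTML objects; the sketch assumes a well-formed template and ignores element attributes for brevity.

```python
from html.parser import HTMLParser


class Node:
    """Hypothetical tree object: an element (tag set) or a run of text."""

    def __init__(self, tag=None, text=""):
        self.tag, self.text, self.children = tag, text, []

    def create(self, out):
        """Handle "create": emit own statement(s), then forward to children."""
        if self.tag is None:
            out.append(self.text)
            return
        out.append(f"<{self.tag}>")
        for child in self.children:
            child.create(out)
        out.append(f"</{self.tag}>")


class TemplateParser(HTMLParser):
    """Builds an object tree from a template (attributes are ignored here)."""

    def __init__(self):
        super().__init__()
        self.root = None
        self.stack = []

    def handle_starttag(self, tag, attrs):
        node = Node(tag.upper())  # HTMLParser reports tag names in lowercase
        if self.stack:
            self.stack[-1].children.append(node)
        else:
            self.root = node
        self.stack.append(node)

    def handle_endtag(self, tag):
        if self.stack:
            self.stack.pop()

    def handle_data(self, data):
        if data.strip() and self.stack:
            self.stack[-1].children.append(Node(text=data.strip()))


parser = TemplateParser()
parser.feed("<HTML><HEAD></HEAD><BODY>Hello</BODY></HTML>")
parser.close()

out = []
parser.root.create(out)  # send "create" to the root of the object tree
document = "".join(out)
print(document)
```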
An HTML document can also be generated without using an HTML template. An object tree can be created dynamically during runtime. For example, a root object (e.g., a PAGE object) is instantiated at runtime. During processing, additional HTML objects can be instantiated and inserted into the object tree. For example, a BODY object is inserted in the object tree as a child of the root object based on logic contained in an application procedure. Attributes of the BODY object can be set during application processing. Additional HTML objects can be added to the object tree in a similar manner until all of the objects have been assembled. As previously described, the HTML document is generated by sending a "create" message to the objects in the object tree.
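Building the tree at runtime, without a template, might look like the following Python sketch. The `HTMLObject` class, its `insert` and `create` methods, and the `is_returning_user` flag are hypothetical; an `HTMLObject("HTML")` stands in for the PAGE root object.

```python
class HTMLObject:
    """Hypothetical HTML object that can be assembled into a tree at runtime."""

    def __init__(self, tag, **attributes):
        self.tag = tag
        self.attributes = attributes
        self.children = []

    def insert(self, child):
        """Insert a child object (or a text string) and return it."""
        self.children.append(child)
        return child

    def create(self):
        """Emit this element's statement(s), forwarding "create" to children."""
        attrs = "".join(f' {k}="{v}"' for k, v in self.attributes.items())
        inner = "".join(
            c.create() if isinstance(c, HTMLObject) else str(c)
            for c in self.children
        )
        return f"<{self.tag}{attrs}>{inner}</{self.tag}>"


page = HTMLObject("HTML")               # root object instantiated at runtime
body = page.insert(HTMLObject("BODY"))  # inserted as a child of the root
body.attributes["BGCOLOR"] = "white"    # attribute set during processing

is_returning_user = True                # hypothetical application logic
body.insert("Welcome back" if is_returning_user else "Welcome")

document = page.create()
print(document)
```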
An HTML template can include a server-side HTML extension known as the group extension. A group extension provides the ability to create a block of HTML statements. A name attribute of the group extension provides the ability to identify the group. The name attribute can be stored in a hash table. During processing, the group identity can be retrieved and used to traverse an object tree.
The group extension also ensures the scoping of named elements. A group object maintains a hash table that includes the named elements which are its members. The group element scopes the named elements within itself. Therefore, two elements having the same name in different groups are distinguishable. Each one is scoped to its respective group.
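The scoping behavior can be illustrated with Python dictionaries standing in for the hash tables. The `Group` and `Element` classes and the group and element names are hypothetical examples, not part of the group extension's definition.

```python
class Element:
    """Hypothetical named element."""

    def __init__(self, name, value):
        self.name, self.value = name, value


class Group:
    """Hypothetical group object that scopes its named members."""

    def __init__(self, name):
        self.name = name
        self.members = {}  # hash table of named elements in this group

    def add(self, element):
        self.members[element.name] = element
        return element

    def lookup(self, name):
        return self.members[name]


groups = {}  # hash table keyed by the group's name attribute
for group_name, value in (("billing", "12 Main St"), ("shipping", "9 Elm Rd")):
    group = groups.setdefault(group_name, Group(group_name))
    group.add(Element("address", value))

# The same element name resolves to a different element in each group.
print(groups["billing"].lookup("address").value)
print(groups["shipping"].lookup("address").value)
```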
A declarations file is used in combination with the group extension. The declarations file contains additional definitions for a group extension. An entry in the declarations file includes a label that links the entry to the group extension. The entry also declares an HTML object. That is, the class of the HTML object is defined. Properties of the HTML object can also be defined within the entry. Values for properties provided in the object class definition can be used to populate the properties in an instance of the object class instantiated for the group.
A declaration entry modifies its associated group by adding elements to the group or modifying the elements that already exist in the group. For example, an HTML object, or element, declared in the entry inserts itself into the group that bears the same name as the declaration entry. Property values that are declared in the declaration entry are used to modify the HTML object's properties.
An instance of the group extension contained in an HTML template is not included in the HTML document that is sent to the client browser. Therefore, the client browser does not need to recognize the group extension. When the HTML document is rendered, the HTML objects contained within a group object render themselves to generate the HTML for the group. The HTML objects generate actual HTML statements within the HTML document. That is, the group does not generate HTML statements. However, the group transmits a message to its children (e.g., an HTML object that inserted itself within the group) to generate HTML statements.
A declaration entry may contain, for example, a declaration for a text string HTML object. The STRING HTML object is inserted into the group identified by the entry. When the HTML document is rendered, the STRING HTML object generates the HTML statements necessary to insert the text string into the HTML document.
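The interaction between a declaration entry, its group, and a STRING object can be sketched in Python. The text does not give the syntax of the declarations file, so the dictionary below is a hypothetical in-memory form of one entry; the class names, the `greeting` label, and the `value` property are likewise assumptions.

```python
class StringObject:
    """Stand-in for a STRING HTML object that renders a text string."""

    def __init__(self, value=""):
        self.value = value

    def create(self):
        return self.value


class Group:
    """Stand-in for a group object; it emits no HTML of its own."""

    def __init__(self, name):
        self.name = name
        self.children = []

    def create(self):
        # The group only forwards the "create" message to its children.
        return "".join(child.create() for child in self.children)


CLASSES = {"STRING": StringObject}

# Hypothetical declaration entries: label -> object class and property values.
declarations = {
    "greeting": {"class": "STRING", "properties": {"value": "Hello, world"}},
}

groups = {"greeting": Group("greeting")}

for label, entry in declarations.items():
    obj = CLASSES[entry["class"]]()          # instantiate the declared class
    for prop, value in entry["properties"].items():
        setattr(obj, prop, value)            # populate declared properties
    groups[label].children.append(obj)       # insert into the same-named group

html = "<BODY>" + groups["greeting"].create() + "</BODY>"
print(html)
```

The rendered output contains only the text string and its enclosing element; no trace of the group itself reaches the client browser.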
A group extension also provides the ability to identify a block of HTML as a repeating subcomponent of the HTML document. The block identified by the group extension can be repeated multiple times within the HTML document. A repeating group can be used to render HTML statements that contain data that is retrieved from an external data source, for example.
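A repeating group can be pictured as a block that is rendered once per record. In the Python sketch below, the `RepeatingGroup` class, the `render_item` function, and the sample rows are hypothetical; the list of dictionaries stands in for data retrieved from an external data source.

```python
from html import escape


class RepeatingGroup:
    """Hypothetical group whose block is rendered once per data row."""

    def __init__(self, name, render_row):
        self.name = name
        self.render_row = render_row

    def create(self, rows):
        # Repeat the block for every row supplied by the data source.
        return "".join(self.render_row(row) for row in rows)


def render_item(row):
    """Render one repetition of the block as a list item."""
    return f"<LI>{escape(row['title'])}: {escape(str(row['price']))}</LI>"


# Stand-in for rows retrieved from an external data source.
rows = [{"title": "Widget", "price": 5}, {"title": "Nuts & bolts", "price": 2}]

items = RepeatingGroup("items", render_item)
document = "<UL>" + items.create(rows) + "</UL>"
print(document)
```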