1. Field of the Invention
This invention relates to the field of data source access in a three tier environment.
Portions of the disclosure of this patent document contain material that is subject to copyright protection. The copyright owner has no objection to the facsimile reproduction by anyone of the patent document or the patent disclosure as it appears in the Patent and Trademark Office file or records, but otherwise reserves all copyright rights whatsoever.
2. Background Art
The Internet, or World Wide Web, is used extensively to access information from a variety of sources. Usually, the Internet is accessed on a computer by using a software application known as a "browser". Sometimes it is desired to access information from a database and use that information in an application used with the browser. A disadvantage of the current technology is that it requires advance knowledge of the database and the application so that they may communicate with each other. This problem can be better understood by a review of the structure and operation of the Internet.
The Internet
The Internet is a worldwide matrix of interconnected computers. An Internet client accesses a computer on the network via an Internet provider. An Internet provider is an organization that provides a client (e.g., an individual or other organization) with access to the Internet (via analog telephone line or Integrated Services Digital Network line, for example). A client can, for example, read information from, download a file from or send an electronic mail message to another computer/client using the Internet.
To retrieve a file on the Internet, a client must search for the file, make a connection to the computer on which the file is stored, and download the file. Each of these steps may involve a separate application and access to multiple, dissimilar computer systems. The World Wide Web (WWW) was developed to provide a simpler, more uniform means for accessing information on the Internet.
The components of the WWW include browser software, network links, and servers. The browser software, or browser, is a user-friendly interface (i.e., front-end) that simplifies access to the Internet. A browser allows a client to communicate a request without having to learn a complicated command syntax, for example. A browser typically provides a graphical user interface (GUI) for displaying information and receiving input. Examples of browsers currently available include Mosaic, Netscape Navigator and Communicator, Microsoft Internet Explorer, and Cello.
Information servers maintain the information on the WWW and are capable of processing a client request. Hypertext Transfer Protocol (HTTP) is the standard protocol for communication with an information server on the WWW. HTTP has communication methods that allow clients to request data from a server and send information to the server.
To submit a request, the client contacts the HTTP server and transmits the request to the HTTP server. The request contains the communication method requested for the transaction (e.g., GET an object from the server or POST data to an object on the server). The HTTP server responds to the client by sending a status of the request and the requested information. The connection is then terminated between the client and the HTTP server.
A client request, therefore, consists of establishing a connection between the client and the HTTP server, performing the request, and terminating the connection. The HTTP server does not retain any information about the request after the connection has been terminated. HTTP is, therefore, a stateless protocol. That is, a client can make several requests of an HTTP server, but each individual request is treated independently of any other request. The server has no recollection of any previous request.
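The connect, request, respond, and disconnect cycle described above can be sketched as follows. This is a minimal illustration only, assuming Python's standard http.server and http.client modules; the EchoHandler class and the fetch and demo helpers are hypothetical names introduced for this example and are not part of any system described here. Each call to fetch opens its own connection and closes it afterward, and the handler keeps no record of earlier requests.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class EchoHandler(BaseHTTPRequestHandler):
    """Hypothetical handler: answers every GET with the requested path."""

    def do_GET(self):
        body = f"you asked for {self.path}".encode("utf-8")
        self.send_response(200)  # status of the request
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)  # the requested information

    def log_message(self, fmt, *args):
        pass  # suppress per-request logging to keep the example quiet


def fetch(port, path):
    """Open a connection, perform one GET, then terminate the connection."""
    conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    try:
        conn.request("GET", path)
        response = conn.getresponse()
        return response.status, response.read().decode("utf-8")
    finally:
        conn.close()


def demo():
    """Start a local server, issue two independent requests, shut down."""
    server = HTTPServer(("127.0.0.1", 0), EchoHandler)  # port 0: any free port
    port = server.server_address[1]
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        first = fetch(port, "/a")
        second = fetch(port, "/b")
    finally:
        server.shutdown()
        server.server_close()
    return first, second
```

Calling demo() returns the status and body of each request. The second response depends only on the second request, which is what statelessness means in practice.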
An addressing scheme is employed to identify Internet resources (e.g., HTTP server, file or program). This addressing scheme is called Uniform Resource Locator (URL). A URL contains the protocol to use when accessing the server (e.g., HTTP), the Internet domain name of the site on which the server is running, the port number of the server, and the location of the resource in the file structure of the server.
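As an illustration of the four parts of a URL listed above, the following sketch splits an example URL using Python's standard urllib.parse module. The describe_url helper and the example address are assumptions made for this illustration only.

```python
from urllib.parse import urlsplit


def describe_url(url):
    """Return the four URL components named in the text above."""
    parts = urlsplit(url)
    return {
        "protocol": parts.scheme,    # e.g., HTTP
        "domain": parts.hostname,    # Internet domain name of the site
        "port": parts.port,          # port number of the server
        "location": parts.path,      # location in the server's file structure
    }


info = describe_url("http://www.example.com:8080/docs/index.html")
print(info)
```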
The WWW uses a concept known as hypertext. Hypertext provides the ability to create links within a document to move directly to other information. To activate the link, it is only necessary to click on the hypertext link (e.g., a word or phrase). The hypertext link can be to information stored on a different site than the one that supplied the current information. A URL is associated with the link to identify the location of the additional information. When the link is activated, the client's browser uses the link to access the data at the site specified in the URL.
If the client request is for a file, the HTTP server locates the file and sends it to the client. An HTTP server also has the ability to delegate work to gateway programs. The Common Gateway Interface (CGI) specification defines a mechanism by which HTTP servers communicate with gateway programs. A gateway program is referenced using a URL. The HTTP server activates the program specified in the URL and uses CGI mechanisms to pass program data sent by the client to the gateway program. Data is passed from the server to the gateway program via command-line arguments, standard input, or environment variables. The gateway program processes the data and returns its response to the server using CGI (via standard output, for example). The server forwards the data to the client using HTTP.
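A gateway program of this kind might look roughly like the sketch below. It is an illustration under stated assumptions, not a definitive implementation: the gateway function and the "name" form field are hypothetical, while REQUEST_METHOD, QUERY_STRING, and CONTENT_LENGTH are environment variables defined by the CGI specification. The environment and the input and output streams are passed in as parameters so the logic can be exercised without a running HTTP server; a real gateway program would use os.environ, sys.stdin, and sys.stdout.

```python
import io
from html import escape
from urllib.parse import parse_qs


def gateway(environ, stdin, stdout):
    """Hypothetical gateway program: greet the caller named in the request."""
    method = environ.get("REQUEST_METHOD", "GET")
    if method == "POST":
        # POST data arrives on standard input; its length is in the environment.
        length = int(environ.get("CONTENT_LENGTH") or 0)
        raw = stdin.read(length)
    else:
        # GET data arrives in the QUERY_STRING environment variable.
        raw = environ.get("QUERY_STRING", "")
    fields = parse_qs(raw)
    name = fields.get("name", ["world"])[0]
    # The response goes back to the server on standard output:
    # a header block, a blank line, then the document body.
    stdout.write("Content-Type: text/html\n\n")
    stdout.write(f"<HTML><BODY>Hello, {escape(name)}</BODY></HTML>\n")


out = io.StringIO()
gateway({"REQUEST_METHOD": "GET", "QUERY_STRING": "name=Ada"}, io.StringIO(""), out)
print(out.getvalue())
```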
A browser displays information to a client/user as pages or documents (referred to as "web pages" or "web sites"). A language is used to define the format for a page to be displayed in the WWW. The language is called Hypertext Markup Language (HTML). A WWW page is transmitted to a client as an HTML document. The browser executing at the client parses the document and produces and displays a page based on the information in the HTML document.
HTML is a structural language that is comprised of HTML elements that are nested within each other. An HTML document is a text file in which certain strings of characters, called tags, mark regions of the document and assign special meaning to them. These regions are called HTML elements. Each element has a name, or tag. An element can have attributes that specify properties of the element. Blocks or components include unordered lists, text boxes, check boxes, and radio buttons, for example. Each block has properties such as name, type, and value. The following provides an example of the structure of an HTML document:
<HTML>
  <HEAD>
  . . . element(s) valid in the document head
  </HEAD>
  <BODY>
  . . . element(s) valid in the document body
  </BODY>
</HTML>
Each HTML element is delimited by the pair of characters "<" and ">". The name of the HTML element is contained within the delimiting characters. The combination of the name and delimiting characters is referred to as a marker, or tag. Each element is identified by its marker. In most cases, each element has a start and ending marker. The ending marker is identified by the inclusion of another character, "/", that follows the "<" character.
HTML is a hierarchical language. With the exception of the HTML element, all other elements are contained within another element. The HTML element encompasses the entire document. It identifies the enclosed text as an HTML document. The HEAD element is contained within the HTML element and includes information about the HTML document. The BODY element is also contained within the HTML element. The BODY element contains all of the text and other information to be displayed. Other HTML elements are described in HTML reference manuals.
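The nesting of elements can be made visible with a short sketch that walks an HTML document and records each element at its depth in the hierarchy. The sketch assumes Python's standard html.parser module; the NestingRecorder class, the outline helper, and the sample document are hypothetical and introduced for illustration only. It also assumes that every element in the input has both a start and an ending marker.

```python
from html.parser import HTMLParser


class NestingRecorder(HTMLParser):
    """Record each element name, indented by its depth in the hierarchy."""

    def __init__(self):
        super().__init__()
        self.depth = 0
        self.lines = []

    def handle_starttag(self, tag, attrs):
        # The parser reports names in lower case; upper case matches the text.
        self.lines.append("  " * self.depth + tag.upper())
        self.depth += 1

    def handle_endtag(self, tag):
        self.depth -= 1


def outline(document):
    """Return one indented line per element found in the document."""
    recorder = NestingRecorder()
    recorder.feed(document)
    recorder.close()
    return recorder.lines


sample = "<HTML><HEAD><TITLE>Demo</TITLE></HEAD><BODY><P>Hello</P></BODY></HTML>"
print("\n".join(outline(sample)))
```

In the resulting outline, HEAD and BODY appear one level below HTML, and TITLE and P appear one level below those, which mirrors the containment described above.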
Communicating with Databases
One typical use of computer systems is to retrieve data from a database. FIG. 2 illustrates an early prior art architecture for accomplishing this task known as "client-server" architecture. In this architecture, the client 201 is a computer system with processing capability, and running an application that can process data. However, the client does not store the data that it uses. That data is stored in a central location such as mass storage 204 and is provided to the client 201 by a database server 202. The client 201 communicates to the server 202 over a communication link 203. The client prepares a request for data that is called a database query and sends the query over link 203 to the server 202. The server 202 processes the query, collects the requested data from mass storage 204, and returns the data to the client 201 over the communications link 203.
One disadvantage of the client-server architecture is the requirement of a direct communications link to the database server. Another disadvantage is the fact that the requesting application on the client machine is written specifically to interact with the database on the server machine. If the database system changes, or if it is desired to have the client access another database, new program code must be written to handle the change.
Database Access Through Internet
With the increased use of the internet, it was desired to provide access to databases from internet clients. An internet client uses a browser to access the internet via a web server. An example of an architecture for accessing a database through the internet is illustrated in FIG. 3. A client running a browser 301 communicates through the internet 307 to a web server 302. The web server 302 receives requests from the browser and returns information to the browser through the internet. When a browser makes a database request, the web server forwards the request to a process 303 that can communicate with a database. To provide the ability to talk to more than one database, such as databases 305 and 306 of FIG. 3, an interface protocol known as Common Gateway Interface (CGI) 304 was created and used to interface between process 303 and databases 305 and 306.
In operation, the web server 302 receives a request for data. The web server 302 creates a database query that is sent to process 303. Process 303 communicates with the appropriate database through CGI 304 and returns the results to web server 302. The web server 302 generates an HTML page using the returned data and sends the page to the browser 301 through the internet 307. This approach is referred to as a "three tier" architecture. The tiers consist of the client, the interface (web server) and the database.
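The flow of a request through the three tiers can be sketched as follows. This is a simplified illustration under several assumptions: an in-memory SQLite database stands in for the database tier, plain function calls stand in for the network and CGI hops, and the parts table, its contents, and the function names are all hypothetical.

```python
import sqlite3
from html import escape


def make_database():
    """Database tier: an in-memory SQLite database with sample rows."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE parts (name TEXT, price REAL)")
    db.executemany(
        "INSERT INTO parts VALUES (?, ?)",
        [("bolt", 0.25), ("gear", 4.5)],
    )
    db.commit()
    return db


def database_process(db, max_price):
    """Stand-in for process 303: run the query and return the rows."""
    cursor = db.execute(
        "SELECT name, price FROM parts WHERE price <= ? ORDER BY name",
        (max_price,),
    )
    return cursor.fetchall()


def web_server(db, request):
    """Stand-in for web server 302: build a query, then an HTML page."""
    rows = database_process(db, float(request["max_price"]))
    items = "".join(
        f"<LI>{escape(name)}: {price:.2f}</LI>" for name, price in rows
    )
    return f"<HTML><BODY><UL>{items}</UL></BODY></HTML>"


# Client tier: the browser submits a request and receives an HTML page.
page = web_server(make_database(), {"max_price": "1.00"})
print(page)
```

Note that the SQL statement and the page layout are both fixed inside the middle tier, which is why a change to the database in this arrangement requires the gateway code to be rewritten.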
There are a number of disadvantages of this prior art three tier architecture. One disadvantage is that a CGI script is required for each database. Writing the CGI scripts is complex and requires knowledge of the internet, the web server, and the database being accessed. Each application is developed independently with little possibility for sharing of resources. Another disadvantage is the lack of interactivity. There is no computation being accomplished on the browser side. For example, in some cases a user may need to complete and submit a request form to retrieve information from a database. If the user enters incorrect information, the form must be submitted through the internet to the web server, processed, and then returned to the user before the user knows that its request is not correct.
Another disadvantage is that if the database changes, the CGI process needs to be recoded. This makes it hard to keep CGI processes up to date. Recoding could be required even for different versions of the original database. In addition, access to the database, that is, permission to use the database, is handled at the web server. Therefore, coding of some permission scheme is required.
Another disadvantage of the prior art CGI approach is that often a business must maintain two sets of applications, one for in-house use and one for customers. For example, a business will access its own database through a local area network that does not require the use of CGI scripts or processes. But customer access is limited to the web server topology, requiring CGI scripts. Administration of systems in the CGI topology is currently manual, with no automatic control systems. Another disadvantage is the lack of source code control in the prior art.
One prior art approach to provide improved access to a database from a browser is to use browser independent applets. An embodiment of this prior art scheme is illustrated in FIG. 4. FIG. 4 is similar to FIG. 3, with the addition of a Java™ Virtual Machine 401 executing on the browser computer. The virtual machine 401 receives applets (object code) stored on the server when needed for execution on the browser via the virtual machine. One example of an applet is applet 402 of FIG. 4.
In the example of FIG. 4, the applet is an executable application that performs database accesses directly to the database. The applet may be a form, for example, that a user completes in order to make a database request. The applet may have tests to determine if the form is completed properly prior to the form being submitted to the database. This provides more efficient operation than the prior art scheme of sending the form data to the web server for validation.
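The kind of test an applet might run before submitting a form can be sketched as follows. Python is used here for illustration in place of the Java programming language of the applet, and the field names, the validation rules, and the send callback are all hypothetical. The point is only that an incorrectly completed form is rejected on the client side and never travels to the server.

```python
def validate_form(form):
    """Return a list of problems found in the form; empty if it is valid."""
    errors = []
    if not form.get("customer_id", "").isdigit():
        errors.append("customer_id must be a number")
    if not form.get("part_name", "").strip():
        errors.append("part_name is required")
    return errors


def submit(form, send):
    """Send the form only if it passes the client-side tests."""
    errors = validate_form(form)
    if errors:
        return errors  # reported to the user immediately; nothing is sent
    send(form)
    return []


sent = []  # stands in for the requests that actually leave the client
bad = submit({"customer_id": "abc", "part_name": "gear"}, sent.append)
good = submit({"customer_id": "42", "part_name": "gear"}, sent.append)
print(bad, good, len(sent))
```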
A problem with the topology of FIG. 4 is that the web servers of the prior art were not aware of the Java™ programming language. As a result, the applet communicated directly to the database. This produced a client-server topology for database accesses, with all of the attendant disadvantages described above.