A World Wide Web (WWW) site typically consists of a collection of HyperText Markup Language (HTML) documents. HTML is a text-based markup language that provides for hyperlinked graphic display. A user of a Web browser typically requests that a Web server download HTML text to the browser. Currently, some popular Web browsers are Netscape Navigator and Microsoft Internet Explorer. The Web browser interprets the downloaded HTML text and generates screen images from it. The HTML text typically describes hyperlinked hot spots that cause further downloads when selected.
HTML documents are typically generated by word processors such as Microsoft Word, or by more specialized HTML document editors, and are stored as text files in directories on Web servers. By convention, HTML text files have file names with an extension of either ".HTML" or ".HTM", depending on the operating system. Web servers recognize these file extensions and treat such documents as static byte streams. Static HTML files are typically transmitted verbatim by Web servers to Web browsers, and are thus fairly efficient for the Web servers to process.
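By way of illustration only, the extension-based handling of static files described above might be sketched as follows. The names `STATIC_EXTENSIONS`, `is_static_html`, and `serve_static` are hypothetical and are not taken from any particular Web server; the sketch assumes only that a file with a static extension is returned byte for byte.

```python
import os

# Hypothetical set of extensions treated as static HTML, following the
# ".HTML" / ".HTM" convention described in the text.
STATIC_EXTENSIONS = {".html", ".htm"}


def is_static_html(file_name: str) -> bool:
    """Return True when the file name carries a static HTML extension."""
    _, extension = os.path.splitext(file_name)
    return extension.lower() in STATIC_EXTENSIONS


def serve_static(path: str) -> bytes:
    """Return the file's bytes verbatim, without interpreting its contents."""
    with open(path, "rb") as handle:
        return handle.read()
```

Because `serve_static` never inspects the bytes it returns, the per-request cost in this sketch is limited to locating, opening, reading, and closing the file.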
One recent application that has gained popularity on the World Wide Web is database access. HTML offers an efficient, flexible way to provide a sophisticated user interface to databases, and many WWW access requests are ultimately turned into database accesses. Part of the flexibility of using HTML for this type of application is that user interface changes tend to be fairly easy and do not require the significant programming resources that were required by earlier generations of database interfaces.
To support dynamic or customized Web content, modern commercial Web servers typically recognize additional file types. For example, Microsoft Web server programs recognize files with the extension ".ASP", which indicates that the file is an "Active Server Page" containing various fields that must be analyzed by the Web server. In some instances, these active or dynamic fields are embedded database requests.
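The per-request analysis that such a dynamic page requires can be illustrated with a minimal sketch. The `{{name}}` field marker, the `render_dynamic` function, and the in-memory `FAKE_DATABASE` are all assumptions made for illustration; they do not reproduce Active Server Page syntax or the behavior of any real server product.

```python
import re

# Hypothetical field marker such as {{customer_name}}; this is NOT the syntax
# of Active Server Pages or of any other real product.
FIELD_PATTERN = re.compile(r"\{\{(\w+)\}\}")


def render_dynamic(template: str, lookup) -> str:
    """Scan the template and replace each field with the value lookup(name) returns."""
    return FIELD_PATTERN.sub(lambda match: str(lookup(match.group(1))), template)


# An in-memory dict stands in for the database that an embedded field might query.
FAKE_DATABASE = {"customer_name": "A. Smith", "balance": "42.00"}

page = render_dynamic(
    "<p>Hello {{customer_name}}, your balance is {{balance}}.</p>",
    FAKE_DATABASE.get,
)
```

Unlike a static file, the page in this sketch must be scanned in full and each field resolved on every request, which is the additional work that distinguishes dynamic documents from static ones.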
Supporting active or dynamic Web documents introduces a number of performance problems into Web servers. One problem that frequently arises is that having a Web server interpret each HTML command in a dynamic HTML file typically requires a significant amount of computer resources. The present solution to this problem is to replicate the server and database as many times as necessary to provide the required levels of service. Currently, some applications are implemented with databases and database servers replicated upwards of thirty times, with access to the replicated servers provided by sophisticated high-speed load-leveling routers. While this works to some extent for databases that are not heavily updated, it tends not to work well when the databases need frequent updates, because updates to all of the replicated copies of the database must be synchronized, which is quite difficult.
The replication approach works reasonably well for fairly small databases, since the amount of data that needs to be replicated for each database server is fairly small. However, this approach does not scale well. In particular, it is currently infeasible for enterprise-level databases consisting of terabytes of data. One reason for this infeasibility is that large companies often have a hard enough time maintaining online access to a single copy of their enterprise-level database, given the size of these databases. Replicating such a database even a couple of times is not feasible. Added to this are the concurrency problems between database copies that arise any time multiple database copies are in use, and the difficulty of supporting online updates for replicated data.
Another problem is that transmitting entire screens back and forth between Web servers and Web browsers can be expensive in terms of resources such as processor usage and communications bandwidth. Higher and higher speeds are promised for the Internet and the World Wide Web. For example, the Regional Bell Operating Companies (RBOCs) are rolling out DSL/ADSL lines capable of megabit transmission rates, and the cable companies are starting to support Internet traffic over their cable systems. However, the amount of data being transmitted over these communications links appears to be growing at an even higher rate. Also, while end-user speeds are rapidly increasing, backbone speeds are not keeping pace.
One additional problem encountered in high-speed transaction systems utilized as Web servers is that HTML hyperlinks typically specify the names of files containing HTML documents. These file names are either relative to the current document, or are specified as full UNIX, Windows, or Mac type path names. Following these path names works adequately on low-volume Web servers and Web browsers. However, it is not uncommon to find in Web servers that more resources are spent opening and closing files than are spent actually interpreting and transmitting the contents of the files.
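The file-name handling described above can be sketched, under stated assumptions, as follows. The function `resolve_link` and the class `CountingFileStore` are hypothetical; UNIX-style path names are assumed, and a dict stands in for the file system so that the number of file opens can be counted.

```python
import posixpath


def resolve_link(current_document: str, href: str) -> str:
    """Resolve a hyperlink's file name against the document that contains it."""
    if posixpath.isabs(href):
        # A full path name is used as given (after normalization).
        return posixpath.normpath(href)
    # A relative name is interpreted relative to the current document's directory.
    base_directory = posixpath.dirname(current_document)
    return posixpath.normpath(posixpath.join(base_directory, href))


class CountingFileStore:
    """Hypothetical stand-in for a file system that counts file opens."""

    def __init__(self, files):
        self._files = files      # maps path name -> document bytes
        self.open_count = 0

    def read(self, path: str) -> bytes:
        self.open_count += 1     # each request pays for one open/close pair
        return self._files[path]


store = CountingFileStore({"/site/docs/next.html": b"<html></html>"})
target = resolve_link("/site/docs/index.html", "next.html")
for _ in range(3):
    store.read(target)           # three requests for the same document
```

In this sketch every request for the same document incurs a fresh open and close, so `store.open_count` grows in step with the request volume regardless of how small the document is.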
As databases accessible from the World Wide Web increase in size, performance issues become more and more important. There is therefore a need for an efficient mechanism by which Web servers can process HTML files containing active or dynamic HTML commands. There is also a need to minimize the processor cycles and communications bandwidth consumed when transmitting and receiving Web pages. There is a further need to minimize the system overhead generated by opening and closing files in order to evaluate file names in HTTP addresses.