1. Field of the Invention
The present invention relates generally to computer networks and client-server software applications and, more specifically, to systems for dynamically generating hypertext forms and receiving and processing such forms containing information filled in by a remote user.
2. Description of the Related Art
A distributed hypermedia computer system may be referred to as a "web." The World Wide Web (hereinafter "the Web," capitalized) is a web that uses the Internet to facilitate global hypermedia communication using specified protocols. One such protocol is the Hypertext Transfer Protocol (HTTP), which facilitates communication of hypertext. Hypertext is the combination of information and links to other information. In the context of the Web, hypertext is defined by the Hypertext Mark-up Language (HTML). The links or hyperlinks in an HTML document reference the locations of resources on the Web, such as other HTML documents. The term "hypertext," rather than "hypermedia," persists despite the fact that the information may comprise images, video and sound as well as text.
The Web is a client-server system. The HTML documents are stored on Web server computers, typically in a hierarchical fashion with the root document being referred to as the home page. The client specifies an HTML document or other resource on the server by transmitting a Uniform Resource Locator (URL), which specifies the protocol to use, e.g., HTTP, the path to the server directory in which the resource is located, and the filename of the resource. An HTTP daemon (HTTPD) is a program running on the server that handles these requests. Users retrieve the documents via client computers. The software running on the user's client computer that enables the user to view HTML documents on the computer's video monitor and enter selections using the computer's keyboard and mouse is known as a browser. The browser typically includes a window in which the user may type a URL. The browser also typically includes a button labeled "Back" that instructs the browser to request the document previously displayed. It may also include a "Forward" button that instructs the browser to re-request a previously displayed document that has been backed over using the "Back" button.
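By way of illustration only, the parts of a URL described above may be separated as in the following minimal Python sketch. The URL shown is hypothetical, and the variable names are assumptions chosen for this example rather than terms used elsewhere in this description.

```python
from urllib.parse import urlsplit
import posixpath

# Hypothetical URL, used only for illustration.
url = "http://www.example.com/catalog/books/index.html"

parts = urlsplit(url)

protocol = parts.scheme   # the protocol to use, e.g., HTTP
server = parts.netloc     # the server on which the resource is located

# Split the path into the server directory and the filename of the resource.
directory, filename = posixpath.split(parts.path)

print(protocol, server, directory, filename)
```

In this sketch the server name is reported separately from the directory path, although both together identify where the resource is located.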
A user may cause a URL to be transmitted by typing it in the designated window on the browser or by maneuvering the cursor to a position on the displayed document that corresponds to a hyperlink to a resource and actuating the mouse button. The latter method is commonly referred to simply as "clicking on the hot-spot" or "clicking on the hyperlink." When the browser transmits a URL to the HTTPD, the HTTPD accesses the specified resource. If that resource is an HTML document, the HTTPD retrieves the document and transmits it back to the browser. If the resource is a program, the HTTPD executes it. In a Web server running the UNIX operating system and either an NCSA or CERN HTTPD, the HTTPD assumes the resource is a program and attempts to execute it if the resource is located in the "cgi-bin" directory. A resource that is an executable program is generally known as a common gateway interface (CGI). On UNIX servers, CGIs are typically UNIX shell scripts or Perl scripts. CGIs written in high level programming languages, such as C or C++, are also known, and are generally preferred over scripts for writing complex CGIs. The CGI may accept input data that the HTTPD received from the browser. The HTTPD provides this input data to the CGI. The CGI may similarly pass output data to the HTTPD for transmission back to the browser, typically in the form of HTML.
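The decision described above, whether to transmit a resource or to execute it, may be sketched as follows. This is a simplified illustration in Python and not the logic of any actual NCSA or CERN HTTPD; the function name `handle_request`, the returned strings, and the example paths are hypothetical.

```python
from pathlib import PurePosixPath

def handle_request(url_path: str) -> str:
    """Return the action a simplified HTTPD might take for a requested path."""
    parts = PurePosixPath(url_path).parts
    # Assumption for this sketch: any resource located under a directory
    # named "cgi-bin" is treated as a program to be executed.
    if "cgi-bin" in parts[:-1]:
        return "execute"
    # Otherwise the resource is treated as a document to send to the browser.
    return "transmit"

print(handle_request("/cgi-bin/order"))   # resource inside cgi-bin
print(handle_request("/docs/home.html"))  # ordinary HTML document
```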
The CGI may function as a stand-alone application program, but as its name implies more commonly functions as a gateway to other application programs. If the Web server has multiple application programs, one of the parameters that the CGI receives from the HTTPD identifies the application that the CGI is to select. If an application program is added to or removed from the server, the CGI must be revised. If an application program is moved to another directory or the directory hierarchy is revised, the CGI must also be revised. Moreover, because each application program may expect data differing in type, number, range, etc., from other application programs, the CGI may include a specialized interface for each application program and unwieldy branching structures. If the application is revised to accept different or additional data, the CGI must also be revised. These maintainability problems increase with an increase in the number of application programs that use the CGI.
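The maintainability problem described above may be illustrated with the following minimal Python sketch of a gateway that branches on an application-selecting parameter. The application functions, the parameter name `app`, and the data each application expects are all hypothetical and are chosen only to show the pattern.

```python
def order_app(item: str, quantity: int) -> str:
    """Hypothetical ordering application expecting an item and a quantity."""
    return f"ordered {quantity} x {item}"

def search_app(query: str) -> str:
    """Hypothetical search application expecting a single query string."""
    return f"results for {query}"

def gateway(params: dict) -> str:
    """Select an application based on one of the received parameters."""
    app = params["app"]
    # Each branch is a specialized interface for one application. Adding,
    # removing, or revising an application requires editing this function.
    if app == "order":
        return order_app(params["item"], int(params["quantity"]))
    elif app == "search":
        return search_app(params["query"])
    raise ValueError(f"unknown application: {app}")

print(gateway({"app": "order", "item": "book", "quantity": "2"}))
print(gateway({"app": "search", "query": "whales"}))
```

Each additional application in such a gateway adds another branch and another set of argument conversions, which is the growth in complexity noted above.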
An HTML document may be a form into which a user may enter information and which the user may submit back to the server (or to a different server). HTML provides means for creating buttons that submit the form, boxes into which a user may type free-form text, boxes having options among which a user may select, boxes in which a user can toggle a checkmark on and off by clicking on it, and other types of input boxes. The browser associates the information ("value") that a user enters with a variable name ("name") specified in the HTML code. The CGI receives the filled-in HTML form from the HTTPD and parses it into "name=value" pairs. An application program running on a server may obtain its input information from the CGI in the form of such name=value pairs.
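As an illustration of the name=value parsing described above, the following Python sketch decodes a submitted form using the standard library function `urllib.parse.parse_qsl`. The form field names and values are hypothetical, and the sketch assumes the browser submitted the form in the common URL-encoded format.

```python
from urllib.parse import parse_qsl

# Hypothetical URL-encoded form data, as a browser might submit it.
body = "item=paperback&quantity=2&gift_wrap=on&note=happy+birthday%21"

# parse_qsl splits on "&" and "=", and decodes "+" and "%XX" escapes.
pairs = parse_qsl(body)

for name, value in pairs:
    print(f"{name}={value}")

# An application program could then look up its inputs by name.
form = dict(pairs)
```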
Web server application software is known that enables a user to shop for and order merchandise. Such systems are sometimes referred to as electronic merchandising systems or virtual storefronts. Systems that enable a user to choose among several retailers' goods are sometimes referred to as electronic malls. An electronic retailer's or electronic mall operator's Web server provides HTML forms that include images and descriptions of merchandise. The user may search for an item by entering a key word search query in a box on a form. When a user selects an item, the server may provide a linked form that describes that item in further detail. The user may enter ordering information into boxes on the form, such as the type and quantity of the item desired. The information entered by the user is transmitted to the server. The user may select multiple items in this manner and then enter a credit card number to pay for the purchases. The retailer processes the transaction and ships the order to the customer.
In the above-described merchandising system, the Web server enters the order information into a database. Similarly, the server merchandising application program may retrieve the product information from a database. Retailers typically maintain their existing customer and product information using commercially available database management software. Many companies produce database application software, but the formats in which the various companies' database management software stores the data are not compatible. In other words, data may need to be reformatted in order to transfer the data from one database management program to another. Nevertheless, server merchandising software is typically oriented toward a particular company's database management program and does not operate with database management programs produced by other companies. Consequently, if a retailer desires to implement a merchandising system on the Web, the retailer may need to convert its existing customer and product database information to a format that is compatible with the server merchandising software.
Furthermore, the server merchandising software is an application program. Although merchandising software written in Perl, a script-like, interpreted language, is known, merchandising software is more typically written in a high-level language and compiled into object code compatible with a particular server platform, i.e., type of computer. Such programs are not portable from one server platform to another. Not only must a software producer compile the merchandising software for each server platform on which it is intended to run, but the software producer must re-compile the software for each server platform each time the software producer revises the software.
Searching for items that the user is interested in purchasing is inefficient in prior merchandising systems. Database management programs use index searching to facilitate rapid searching of large amounts of data. The creator of the database may instruct the program to use specified fields in the database as indexed or key fields. The program locates all terms in the database that appear in the indexed fields and stores them in an index table. Each entry in the index table includes a term and corresponding pointer to the location in the database where the term is found. If a user initiates a search for a term that is present in the index table, the program can locate the instances of that term in the database with exceptional speed. Users who are familiar with the particular database they are searching will generally know which fields are indexed and will know the format of the data in those fields. For example, a user of a database containing the inventory of a bookstore may know that users can search for the names of authors of books and that a user who wishes to do so should enter the author's last name first. A user having such knowledge will therefore be able to search efficiently. Users of electronic merchandising systems, however, are generally end-consumers who have no knowledge of a merchant's database. If, as is very likely, such a user initiates a search for a term that is not present in the index table, the program must sequentially search through all records in the database. Sequential records are typically linked by pointers. Using pointers in this manner is very demanding on server resources, resulting not only in an exceptionally slow search, but also creating a bottleneck for other processes that the server may be executing.
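The difference between an indexed search and a sequential search may be illustrated with the following minimal Python sketch. The bookstore records, the function names, and the choice to index whole field values rather than individual terms are simplifying assumptions made for this example; the sketch does not represent any particular database management program.

```python
from collections import defaultdict

# Hypothetical bookstore inventory.
records = [
    {"author": "Melville, Herman", "title": "Moby-Dick"},
    {"author": "Austen, Jane", "title": "Emma"},
    {"author": "Austen, Jane", "title": "Persuasion"},
]

def build_index(records, field):
    """Map each value of an indexed field to the positions where it occurs."""
    index = defaultdict(list)
    for position, record in enumerate(records):
        index[record[field]].append(position)
    return index

author_index = build_index(records, "author")

def indexed_search(term):
    """Fast lookup; succeeds only if the term matches an index entry exactly."""
    return [records[i] for i in author_index.get(term, [])]

def sequential_search(term):
    """Fallback that examines every field of every record."""
    term = term.lower()
    return [r for r in records if any(term in v.lower() for v in r.values())]

print(indexed_search("Austen, Jane"))  # user knows the "last name first" format
print(indexed_search("Jane Austen"))   # unfamiliar user: no index entry matches
print(sequential_search("emma"))       # full scan of all records
```

A user unfamiliar with the indexed format obtains no result from the index, and the search must fall back to the sequential scan, whose cost grows with the number of records.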
UNIX, a well-known operating system, is the most common operating system for Web servers. UNIX conceptually comprises two elements: a shell and a UNIX kernel. The shell is a program that provides an interface between the user and the kernel. The shell interprets commands the user enters and returns the results of commands to the output device. UNIX includes a rich set of file handling and searching commands. A shell script is a sequence of UNIX commands that form a program that UNIX can execute. Shell scripts are convenient to use, but they run much slower than a functionally equivalent compiled program. UNIX is also a multitasking system, i.e., users can run more than one program simultaneously.
It would be desirable to provide a highly maintainable CGI that efficiently interfaces with multiple web server application programs. It would further be desirable to provide a search engine for web server application programs, such as electronic merchandising systems, that uses a universally accepted data format, that is highly portable, and that can be flexibly searched. These problems and deficiencies are clearly felt in the art and are solved by the present invention in the manner described below.