1. Field of the Invention
The present invention relates to computer networks and more particularly to a method and apparatus for populating a form with data.
2. Background Art
Computer users can shop for merchandise using the Internet. A number of companies maintain web sites where customers can purchase books, records, videos, and many other products. Other web sites provide subscription services that provide the user with access to a service or product. Typically, users make purchases or sign up for services by completing forms. A problem with current Internet schemes is that each field on a form must be filled out separately and manually. This is a time consuming and often frustrating process for the user.
Current systems do not provide a way for users to quickly complete a form (referred to as “populating” a form) with data. For example, when a user wishes to purchase a product online, the user typically supplies a credit card number, a shipping address, and any other personal information that may be needed to complete the transaction. The user supplies this information using an input device such as a keyboard to manually enter data into the fields of a form. However, after buying goods from one vendor, for example, the user may wish to purchase goods from a second vendor. The user must manually complete the second vendor's form even though the second vendor may require the purchaser to provide the same kind of information the user provided to the first vendor. This presents a problem to the user because it forces the user to manually enter the same information into different vendor's forms multiple times.
Another problem presented by the prior art is that users do not have a way to securely store sensitive information and then later use that information to populate a form. Current systems do not provide a way to prevent unauthorized users from obtaining access to sensitive information stored on a user's computer. Instead such systems leave data that is entered into a form readily accessible. For example, data that is entered into a form may be stored locally using a technology called cookies. A cookie is a local representation of information related to a particular web page that was previously visited by the user. Cookies are not encrypted and may be read by anyone who can obtain them. The term anyone also includes any computer program that knows where the cookies are stored. This is problematic because it unnecessarily exposes sensitive information to unauthorized users. Therefore, there is a need for a method and apparatus for populating forms with data where that data is accessible only to authenticated users.
The problems associated with form population may be better understood by the following discussion of the Internet/World Wide Web, web page creation, embedded forms, cookies, and web browser technology.
The Internet/World Wide Web:
A web browser is a type of computer program that provides users with a mechanism for accessing the World Wide Web (WWW). The WWW is a segment of the Internet comprised of numerous web clients and web servers that communicate with one another using a standard set of protocols. A web server is a computer configured to provide web pages to the web client upon request. A web client typically utilizes the web browser application to request web pages from the web server.
The Internet is a global computer network comprised of an amalgamation of interconnected networks that are capable of readily communicating with one another using a standardized set of protocols. Protocols provide a uniform set of communication parameters that enable computers to effectively transmit and receive data. Most computer networks, including the Internet, utilize several different layers of protocols. Each layer provides the network with different functionality.
The WWW is a segment of the Internet that utilizes an application layer protocol called the HyperText Transfer Protocol (HTTP) to disseminate and to obtain information from users. HTTP is a request/response protocol used with distributed, collaborative, hypermedia information systems. In operation, HTTP enables one computer to communicate with another. For example referring now to FIG. 1, web client 100 can use HTTP to communicate with web server 150 via network 125. In this scenario the web server acts as a repository for files 160 and is capable of processing the web client's requests for such files. The files 160 stored on the web server may contain any type of data. For example, the files may contain data used to construct a form, image data, text data, or any other type of data.
HTTP has communication methods that allow web client 100 to request or send data to web server 150. The web client 100 may use web browser 110 to initiate request 115 and receive one or more files from file repository 160. Typically, web browser 110 sends a request 115 for at least one file to web server 150 and the web server forwards the requested file to web client 100 in response 120. The connection is then terminated between web client 100 and web server 150. Web client 100 then uses web browser 110 to display the requested file. A client request 115 therefore, consists of establishing a connection between the web client and the web server using network 125, issuing a response 120 to the request 115, and terminating the connection.
Web server 150 does not utilize state information about the request once the connection is terminated. HTTP is, therefore, a stateless application protocol. That is, a web client can send several requests to a web server, but each individual request is treated independent of any other request. In some instances, the web server maintains a record of such requests, but the web server does not use that information to process later requests. Thus, for example, if a form is completed by the user and submitted to the web server for processing, the web server may maintain a record of the data entered into the form, but that record will not be used to later influence another request from the same client. In other words, web server 150 may record a request in a log file, but it does not later read from the log to determine how to respond to another request from the same client.
Once a file is sent from web server 150 to web client 100 it becomes ready for display. The web client's 100 web browser 110 is typically used to format and display files. Web browser 110 allows the user to request and view a file without having to learn a complicated command syntax. Examples of several widely used web browsers include Netscape Navigator, Internet Explorer, and Opera. Some web browsers can display several different types of files. For example, files written using the HyperText Markup Language (HTML), the JavaScript programming language, the ActiveX programming language, or the Portable Document Format (PDF) may be displayed using a web browser. It is also possible to display various other types of files using language such as Standard Generalized Markup Language (SGML) or eXtensible Markup Language (XML).
Creating a Web Page:
A form, which provides one or more places for a user to enter data, can be embedded inside of a web page. A web page may be created using a variety of different data formats and/or programming languages. Most web pages, and as a result most forms, are created using the HyperText Markup Language (HTML). The techniques used to create a web page will now be discussed in further detail.
HTML is a language that may be used to specify the contents of a web page (e.g. web page 220). An HTML description is typically comprised of a set of markup symbols which are described in more detail below. HTML file 250 or any type of data file that contains the markup symbols for web page 220 may be sent to web browser 210. Web browser 210 executing at web client 200 parses the markup symbols in HTML file 250 and produces web page 220, which is then displayed, based on the information in HTML file 250. Web page 220 may contain text, pictures, or forms comprised of embedded text fields, checkboxes, or other types of data that is to be displayed on the web client using web browser 210. Consequently, HTML document 250 defines the web page 220 that is rendered by web browser 210. For example, the following set of markup symbols directs web browser 210 to display a title, a heading, and an image called “image.jpg”:
<HTML><HEAD><TITLE> This is a document title </TITLE></HEAD><BODY><H1> This text uses heading level one </H1><IMG SRC= “http://www.idealab.com/image.jpg”></BODY></HTML>
In the above example, markup symbols (e.g. “<” and “>”) indicate where each HTML command (e.g. TITLE) begins and ends. An HTML command, which is typically surround by markup symbols, provides the web browser with instructions to execute. Markup symbols typically surround an HTML command. The “<” symbol indicates the start of an HTML command an the “</” symbol indicates the end of an HTML command. Each start or end command has a corresponding “>” to indicate the close of that particular command. Information associated with the HTML command may be contained within the HTML command's start and end symbols. An HTML command is used by the web browser 210 to determine how to process the block of information associated with the two commands.
In the above example, “<TITLE>”, and “</TITLE>” are examples of HTML commands surrounded by markup symbols. The “</TITLE>” HTML command directs web browser 210 to place the text “This is a document title” in the title bar of web browser 210.
Some HTML commands have attribute names associated with the command. For example, HTML command “<IMG>”, directs web browser 210 to display an image. A “SRC=” attribute identifies the location and name of the image to be displayed. In the above example, the statement “<IMG SRC=“http://www.idealab.com/image.jpg”>” tells the web browser to display an image named “image.jpg” that can be obtained from the web server located at “http://www.idealab.com.”
Embedding a Form into a Web Page
An HTML file may also contain HTML commands that cause the web browser to render a web page that contains fields for entering data. The portion of the web page that contains the data entry fields may be referred to as a form (e.g. the portion enclosed in the HTML tags <FORM></FORM>). When the web page is comprised of primarily data entry fields the entire web page is sometimes also referred to as a form. As is discussed below, HTML includes an HTML form command that may cause the browser to display data entry fields. A text box, a drop down menu, a check box, a command button, a toggle button, or any other kind of interface component capable of receiving input are some examples of the type of data entry fields that may be placed in a form. Data is manually entered into the fields by the user and then submitted to a web server for processing. As previously discussed, when a user wishes to use a form to purchase a product, for example, the user manually fills in text boxes located on the form with the type of information needed to carry out the transaction. Other types of data entry fields require the user to select from one or more options such as in the case of a menu, a toggle button, or a radio button, for example. When the form is complete the user submits the completed form to the vendor's web server by pressing a command button, for example.
A problem with collecting information from the user in this manner is that the user is required to manually enter data into the fields of a form. This takes time and becomes unnecessarily burdensome when the user wishes to complete more than one form with the same information.
FIG. 3 provides an example of, a form created using the HTML definition language. Code block 310 contains HTML command examples. When a document comprising code block 310 is transmitted to web browser 300 executing on web client 330, it causes form 305 to be displayed. Web browser 300 displays form 305 by parsing the HTML commands contained in code block 310 and then using the information obtained to format form 305. Once the user finishes entering information into form 305, the user may submit the form by pressing command button 331. Pressing command button 331 sends the information entered by the user in form 305 to web server 320 using the POST method. However, one of ordinary skill in the art could modify code block 310 to use the GET method instead of the POST method.
The <FORM> command shown in code block 310 indicates the beginning of a form. Once the initial FORM command is placed into the HTML document other HTML commands may be entered between the initial FORM command and the closing FORM command (e.g. </FORM>) that represent, for example, one or more data entry fields such as a text-box, drop-down lists, check boxes, radio buttons, and input buttons. The INPUT command is used to specify different types of data entry fields. The type of data entry field created is dependant upon the value assigned to the INPUT command's TYPE attribute. For example, the INPUT command can create a text-box, a password field, a hidden field, a button, a radio button, or a checkbox. A “text” value assigned to the TYPE attribute of an INPUT command creates a text box, for example. Similarly, a TYPE of “checkbox” creates a checkbox. The following code, if inserted after the FORM element, creates a text box and a checkbox:
<INPUT TYPE=“text” NAME=“user_name”><INPUT TYPE=“checkbox” NAME=“user_item1”>
The HTML tags and text contained in code block 310, for example, create form 305 when displayed using web browser 300.
A <FORM> tag may contain an ACTION and/or METHOD attribute. The ACTION attribute specifies what program on the server to execute when form 305 is submitted for processing. The METHOD attribute describes how data entered into the form is handled. There are two METHOD choices: GET and POST. Each one of them is capable of processing the data entered into a form. To illustrate, if a user named Bill who resides at 130 West Union Street, Pasadena, Calif. 91103 with a credit card number of 1111272733334141 completes the Name Address, and Credit Card number fields shown on form 305 by filling in spaces 345-347, then the following data string 350 is created and sent to web server 320.
user_name=Bill&user_address=130 West Union StreetPasadena CA 91103&user_cc=1111222233334444This data string may be handled by either the GET method or the POST method. However, in FIG. 3 it is handled by the POST method. The GET method appends information entered into form 305 by the user onto the end of a Uniform Resource Locator (URL 335) or in a QUERY_STRING environment variable. URL 335 is submitted to web server 320 by web client 330 in an HTTP request 315. Web server 320 is responsible for processing the submitted information. A problem with the GET method is that the information contained in URL 335 is not stored on web client 330 for later use and cannot therefore be later used to populate form 305. With the POST method, the information placed into form 305 by the user is placed into the data stream of the HTTP protocol. This allows web server 320 to process form 305 data using the standard input data stream to a separate program or script on the server. Standard input is a method for providing a program or script with data that is well know to programmers skilled in the art. The POST method also does not provide a way to store data on web client 330 for later use.
The ACTION attribute of the FORM tag indicates where the computer program tasked with processing data string 350 resides. For example, the following portion of code block 310 specifies that a computer program (e.g. a PERL script) named form_processor.pl is located in cgi-bin 321 of sever 320.
<FORM METHOD=POST ACTION=http://server320/form_processor.pl>The form_processor.pl program is responsible for processing data string 350. Programs that utilize the Common Gateway Interface (CGI) standard are typically located in a common directory (e.g. cgi-bin 321). CGI is a specified standard for communicating between HTTP servers and server-side gateway programs. A CGI program is capable of storing data supplied by the user, but it cannot determine whether the user asking for the form is the same user that previously submitted data to the CGI program. If the user has never visited the web site before, the CGI program does not have any information about that user. CGI programs can collect information about users who visit a web site, but such programs cannot collect information about users who have never visited the web site.
The program “form_processor.pl” used in the above example is an example of a server-side program that processes data input by the user via form 305, for example. When form 305 is submitted web server 320 executes the “form_processor.pl” program and passes the data that was entered by the user and then sent from web client 330 to server 321. That is, data string 350 is passed to “form_processor.pl” when a form is submitted. When the gateway program is done processing data string 350 the result is sent to server 320 and a response may be forwarded back to web client 330. Gateway programs can be compiled programs written in languages such as C or C++ or they can be executable scripts written using a scripting language such as PERL, TCL, or various other language. Once data string 350 is processed it may be stored on server 320 in data store 322. A problem with using CGI programs is that since they are located on the server, information cannot easily be saved on the web client for later use.
Thus, existing mechanisms for processing forms reside on the server and are unable to populate a form on the client with user data from users that have never visited the web site having the form (e.g. the data is only available to web servers the user has previously visited).
HTTP Cookies
One example of data that may be stored on the client is a cookie. A cookie is an HTTP header that consists of a text-only string. The text-string is entered into the memory of a web client and accessible by the web browser. This text-string is initially set by the computer serving the web page and consists of the domain name, path, lifetime, and value of a variable that the server sets. If the lifetime of this variable is longer than the time the user spends at that site, then the text-string is saved on the web client for later use. As a result, cookies provide a way to store and later recall values entered into a web page by the user. For example, cookies allow a web server to individually customize a web page for each user who visits the page.
A cookie may be used to populate the elements of a form that a user has previously completed. For example, referring now to FIG. 9, if a user completes field 945-947 of form 905 and submits it to server 920 via submission path 915, server 920 can place the data submitted by the user in cookie 931 by sending web client 910 data via path 925. If the user returns to form 905 again Web browser 930 executing at web client 910 may use the data saved in cookie 931 populate fields 945-947 with the data previously entered by the user. Thus, a cookie can free the user from having to complete form 905 more than once. Cookies can be used to fill forms the user has previously visited. However, the data that is stored by the cookie cannot be used to populate other forms. That is, a problem with cookies is that forms located on different servers are not provided access to the same cookie. Only web servers that are listed as having permission to access a particular cookie may obtain such access.
A cookie is only accessible to the server that set, or created it. Thus, a second form cannot use the first form's cookies to obtain the user's name and address. For example, the web server located at www.merchantA.com may be allowed to read a cookie, but the web server located at www.merchantB.com may not be allowed to read the cookie created by www.merchantA.com
Referring now to FIG. 4, the process used to create and retrieve a cookie is illustrated. Initially, web browser 401 which resides on web client 400 issues a request 410 for web page 415 from web server 450. Web server 450 responds by sending a copy of web page 415 to web client 400 via response 411. Web client 400 uses web browser 401 to display web page 415 to the user. To create a cookie 460, web server 450 sends a “Set-Cookie” command in the header line contained in response 411. For example, in response to a request 410, web server 450 may send the following command in HTTP response 411:
Set-Cookie: NAME=VALUE; expires=DATE; path=PATH;domain=DOMAIN_NAME; secure.The NAME and VALUE parameter contain the information included in the cookie. DATE is the time at which the cookie expires and is thus no longer saved on web client 210. DOMAIN is the host address or domain name for which the cookie is valid. For example, if a cookie is set by a computer having a domain name of www.merchantA.com, only the server responsible for that particular domain name is provided access to the cookie. The PATH parameter specifies a subset of URL's at the appropriate domain for which the cookie is valid. If the keyword secure is used then the cookie is only transmitted over a secure connection. All of these parameters except the NAME=VALUE are optional to set a cookie. Once the Set-Cookie command is sent to web client 400 a cookie 460 is placed on web client 400.
To create a cookie 460 using the UNIX operating system, for example, web server 450 could execute the following shell script in response to a request 410:
#!/bin/shecho Content-type: text/htmlecho Set-cookie: FooBar=foo; expires=Wednesday,02-11-99 12:00:00 GMTechoecho <H1>A Cookie named Foobar containing the textfoo is now present on your computer</H>This script stores “FooBar=foo” on web client 400. The following script allows web server 450 to read the HTTP cookie:
#!/bin/shecho Content-type: text/htmlechoecho The data supplied here was obtained from acookie:<P>echo $HTTP_COOKIEOnce cookie 460 is created, the information stored in the cookie can be used in many different ways. Information in cookie 460 may be accessed by a script that has authorization to insert cookie 460 data into a form, for example. However, a problem with cookies that no authentication mechanism is required to obtain the information they store. A cookie is a text file and is not stored in encrypted form. While a browser may limit access to a cookie, the text that is stored in the cookie may be accessed or read by any program that can read a text file. Therefore, it is not wise to store any sensitive data in a cookie. An additional problem with cookies is that users do not have to enter a user name and password to obtain access to the cookie.Predefined Paste Operation
Some web browsers provide users with a mechanism for pasting predefined data into a form. Under the generic preferences menu of the Opera web browser, for example, the user can select a command button labeled personal information. When this command button is selected a dialog box that contains fields for entering personal information opens. If the user desires, the user can enter a name, an address, a phone number, an e-mail address, and/or any other kind of information into the text fields of the dialog box.
Once the user enters information into the fields of the dialog box and clicks “OK” the information that was entered can be pasted into the fields of a form by performing a right click operation. For example, if the user encounters a form that has a place for entering an address, the user can elect to paste the address previously specified in the dialog box by first selecting the address field and then right-clicking on a pointing device such as a mouse and selecting from a drop down menu the item titled “insert address”.
A problem with pasting information into the fields of a form in this manner is that the user is required to manually select the place to insert the information and the type of information to insert into that place. Another problem is that unauthorized users are not prevented from accessing the information entered into the dialog box.