This invention relates generally to computer-implemented processing of data entry forms, such as HTML-generated forms on Internet web pages. More particularly, the invention provides a method and apparatus for automatically populating data fields in forms using data values previously specified by a user.
Computer systems conventionally display forms with fields into which a user enters information such as a name, birth date, password, and the like. Modern browsers display forms by rendering Hyper Text Markup Language (HTML) to generate fields arranged in a particular structure that can be populated by a user. Web sites that accept shopping orders from on-line customers, for example, generate forms requiring entry of the customer's name, address, telephone number, and credit card information. Usually, the user must repeatedly enter this information each time a site is visited. Although information entered by the user is stored on the web site, the form does not retain the information for future use if the web site is revisited.
Some web sites can recognize previous customers and thus avoid re-prompting for the same information on a subsequent visit. Nonetheless, if the user visits a new web site that he or she has never before visited, the same information must be re-entered on a different form generated by the different web site. Much of the information requested on these forms is redundant or readily available from other sources (e.g., name and address), yet the creators of different forms generally have no easy way to share information previously entered by the user on an earlier form. Privacy issues have thwarted many potential solutions to this problem, and it is cumbersome for web site designers to include special logic on their web site to recognize previous visitors to the site.
So-called "cookies" (small data files stored by a web site on the user's local computer) are sometimes used to retain information locally that can be recalled later by a web site that the user has previously visited. Such "cookies," however, vary widely from site to site, and require cumbersome programming logic on each web site to implement them. Moreover, users can block the storage of these cookies, and users may be generally suspicious of their use by untrusted web sites.
One attempt to solve some of these problems was a prior art feature included in the Microsoft Internet Explorer 4.0 product known as a "profile assistant." This feature made it easier for web sites to retrieve registration and demographic information from users who had previously provided that information. Frequently used information such as user name, address, and the like was stored securely in protected storage on the client computer. Web servers could request to read this information, but it was shared only if users gave their consent in a pop-up request box each time a site was visited.
While the profile assistant provided a potential solution to the aforementioned problems, in practice it required that each web site write script to request information from the user's stored information. If the user declined to grant permission to share the information, the solution was effectively thwarted. It was also inconvenient and time consuming for the user to complete a full profile and store it on the user's machine. Finally, some users viewed the function as intrusive because it required immediate user input to confirm that the feature should be enabled each time a web site was visited.
A prior art data schema known as the "vCard" schema has been used for certain frequently referenced data fields across application programs. This schema established standardized field identifiers that were to be used for the same data fields, and was intended to facilitate the transfer of personal profile information among applications. For example, the field identifier "vCard.FirstName" was reserved as a field identifier for storing a user's first name, regardless of the form or application program into which the user's name was to be entered. (The user would typically only see a label such as "First Name.") This schema does not, however, address the aforementioned problems. As one example, it is difficult to force millions of web sites to conform to standard field identifiers or to retrofit existing web pages to the existing schema.
The prior art provides tools to suggest previously used values to a computer user when prompting the user for information. For example, some e-mail programs suggest possible recipient names in the "to" field which match previously stored user names. When the user types the first character of a recipient's name, a possible choice that matches the first character appears in the field. As another example, well-known Internet browsers provide a user with a pull-down menu of choices in a browser's address field, such that the user can review previously used web site addresses in order to select an address.
These conventional techniques, however, suffer from many of the same disadvantages as the aforementioned solutions. The application program itself (e.g., the e-mail program) must be specially modified to support the feature, and previously used field values cannot be shared among other application programs on the same computer unless those applications are also modified. Moreover, all application programs would need to adopt standard field identifiers in order for the scheme to work properly.
For most web forms there is no deterministic way to associate a given field label with its corresponding text entry area (i.e., labels used on web forms are not linked to field identifiers on the page). For example, a web page that displays the word "Name" next to a text input box invites the user to enter his or her name into the text input box. However, there is no easy way for software reading the text entry box to associate the "Name" label with the text input box, since field labels are not intrinsically linked to other field attributes such as field name and data type. Consequently, while field labels might provide an attractive basis for correlating similar fields across forms, there is no easy way for the underlying software to identify and correlate these labels with values entered by the user.
In summary, Internet web pages containing form fields create special problems, because each web site defines the format and behavior of its own forms, and there is no easy way to share or suggest previously entered data values across different web sites or servers. Moreover, because of privacy concerns, sharing previously entered form values for different web sites may be undesirable or even impossible in many cases.
The present invention overcomes many of the foregoing problems by providing a method and apparatus that assists a user in storing into a profile data values entered on a form on the basis of labels associated with fields on the form. When the user displays a form having the same or similar field labels, a matching process suggests data values for fields on the form. According to one embodiment, the user can initiate this process by clicking on an "autofill form" button.
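By way of illustration only, the label-keyed profile described above might be sketched as follows. The function names (`normalize_label`, `store_values`, `suggest_values`), the use of Python's `difflib` to decide when two labels are "similar," and the 0.8 similarity cutoff are all assumptions made for this sketch; the text above does not specify how similarity is determined.

```python
import difflib


def normalize_label(label):
    """Lower-case a label, drop colons, and collapse whitespace."""
    return " ".join(label.lower().replace(":", " ").split())


def store_values(profile, entered):
    """Save non-empty values from a submitted form, keyed by field label."""
    for label, value in entered.items():
        if value:
            profile[normalize_label(label)] = value


def suggest_values(profile, labels, cutoff=0.8):
    """Suggest a stored value for each label that is the same as, or
    similar to, a label already in the profile.

    The similarity test (difflib ratio >= cutoff) is an assumption.
    """
    suggestions = {}
    for label in labels:
        key = normalize_label(label)
        match = difflib.get_close_matches(key, list(profile), n=1, cutoff=cutoff)
        if match:
            suggestions[label] = profile[match[0]]
    return suggestions


# Values entered on a first form are stored in the profile.
profile = {}
store_values(profile, {"First Name:": "Pat", "Phone": "555-0100", "Notes": ""})

# A later form with the same or similar labels receives suggestions.
suggestions = suggest_values(profile, ["first name", "Firstname", "Password"])
```

In this sketch the labels "first name" and "Firstname" both receive the stored value, while "Password" receives no suggestion because no sufficiently similar label exists in the profile.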
A web browser employing various inventive principles heuristically associates a label with a corresponding text input box and, based on the association, populates the text input box with a previously stored field value. The match can be performed on the basis of a dictionary of strings representing common labels in order to retrieve previously used values. An algorithm uses a hierarchical searching method to intelligently choose likely label candidates for a given field on a web page and confirms that choice by matching it against a previously stored dictionary of potential field labels. Other features include the ability to match fields with a multi-lingual dictionary; color coding of automatically populated field values for ease of use; and an initial profile creation step that matches values extracted from a populated form to a basic set of field labels.
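The heuristic association described above can be sketched, under stated assumptions, as follows. The form is modeled as a grid of table cells, each holding text items and input items; the three search levels (same cell, cell to the left, cell above), the dictionary contents, and all names such as `LABEL_DICTIONARY` and `autofill` are hypothetical choices for illustration and are not taken from the description.

```python
# Hypothetical dictionary of common labels (including one non-English
# entry) mapped to canonical profile keys.
LABEL_DICTIONARY = {
    "name": "name",
    "first name": "first_name",
    "e-mail": "email",
    "email": "email",
    "nombre": "name",
}


def normalize(text):
    """Trim whitespace and trailing ':' or '*' characters, then lower-case."""
    return text.strip().rstrip(":*").strip().lower()


def candidate_labels(grid, r, c, i):
    """Yield candidate label texts for the input at grid[r][c][i],
    searching outward level by level (an assumed search order)."""
    # Level 1: text preceding the input within the same cell.
    for kind, value in reversed(grid[r][c][:i]):
        if kind == "text":
            yield value
    # Level 2: text in the cell immediately to the left.
    if c > 0:
        for kind, value in reversed(grid[r][c - 1]):
            if kind == "text":
                yield value
    # Level 3: text in the cell directly above.
    if r > 0 and c < len(grid[r - 1]):
        for kind, value in reversed(grid[r - 1][c]):
            if kind == "text":
                yield value


def autofill(grid, profile):
    """Return {input field name: stored value} for every input whose
    nearest candidate label is confirmed by the dictionary."""
    filled = {}
    for r, row in enumerate(grid):
        for c, cell in enumerate(row):
            for i, (kind, field_name) in enumerate(cell):
                if kind != "input":
                    continue
                for candidate in candidate_labels(grid, r, c, i):
                    key = LABEL_DICTIONARY.get(normalize(candidate))
                    if key is not None:
                        # First dictionary-confirmed candidate wins.
                        if key in profile:
                            filled[field_name] = profile[key]
                        break
    return filled


# Each row is a list of cells; each cell is a list of (kind, value) items.
form = [
    [[("text", "Name:")], [("input", "fld1")]],
    [[("text", "E-mail"), ("input", "fld2")]],
    [[("text", "Comments")], [("input", "fld3")]],
]
stored = {"name": "Pat Example", "email": "pat@example.com"}
result = autofill(form, stored)
```

Here `fld1` is matched through the cell to its left and `fld2` through text in its own cell, while `fld3` is left empty because "Comments" is not confirmed by the dictionary. Note that the input field names themselves (`fld1`, `fld2`) play no part in the match; only the visible labels do.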