This invention relates generally to the field of data entry systems and, more particularly, to automated word completion systems for operating with unstructured data files, such as word processing documents and e-mail messages.
General purpose digital computers are widely used for a large variety of text-based applications, including word processing, e-mail, spreadsheets, personal calendars, etc. To use the computer for one of these purposes, a user typically types on a keyboard to enter text and commands into an active data file, which is open within an application program running on the computer. Other text input devices include a voice recognition interface, a touch-sensitive screen overlaid on top of a graphical image of a keyboard, or a system that detects the motion of a pen in combination with handwriting recognition software. The text and commands are then interpreted and manipulated by the application program in accordance with the syntax and functionality implemented by the application program.
For many users, the most time-consuming computer activity is the entry of large amounts of text into various data files, such as word processing files and e-mail files. Regardless of the input method used, the speed at which the text can be entered into the computer is a major factor governing the user's efficiency. The designers of text-intensive application programs have therefore developed text-input aids to assist users in entering text into the computer.
A word prediction system is an example of such a text-input aid. Generally stated, a word prediction system predicts and suggests complete data entries based on partial data entries. This allows the user to type in a partial data entry and then accept a predicted word completion with a single keystroke, thus avoiding the keystrokes that would have been required to type the complete data entry. For example, a word prediction system may be configured to recognize a user's name so that the user's complete name, "Dean Hachamovitch" for instance, may be predicted after the user types the first few letters, "Dea" in this example.
Creating word prediction systems that exhibit acceptable memory-use and performance characteristics, and that are not overly disruptive or annoying to the user, is an on-going challenge for software developers. Three techniques have traditionally been used to meet this challenge: (1) organizing the user's document into structured fields; (2) restricting the data space used to predict word completions; and (3) requiring the user to request a word prediction when desired. As the drawbacks associated with each of these techniques are described below, it will become clear that there is a continuing need for word prediction systems that automatically predict unrestricted word completions for data entries in an unstructured portion of a data file, such as the body of a word processing document or e-mail message.
Because there are a limited number of words available in any given language, many of the words forming the vocabulary of the language are used frequently. This is particularly true for data files that include structured fields for certain data entries, such as the "from" and "to" fields of an e-mail message, or the "payee" and "amount" fields of a bank check. A structured field supplies a context for data to be entered into the field. This context can be used to limit the choice of word predictions for the field, and increase the likelihood that a suggested word completion is correct. Word prediction systems therefore work well for structured data fields because the choice of words used in a particular structured field can often be sufficiently limited so that the word prediction system can offer reasonably likely suggestions within acceptable memory-use and performance characteristics.
Most-recently-used (MRU) text completion has been deployed in connection with structured data fields to speed text entry and also serve as a memory aid for repetitive data entries. These word prediction methods use an MRU data entry list for each structured field to provide a list of word prediction choices for the field. That is, a list of the most recent items entered into the structured field is used to suggest word completions for partial data entries entered into the field. For example, a personal finance program may maintain a record of a person's previous bank checks. In order to speed entry of the check payee on a new check, the program keeps an MRU list of prior check payees. This list is used to automatically suggest a completion for the payee name after the first few letters of the payee have been typed by the user. For instance, if a user has previously written checks to "Georgia Power," the complete data entry "Georgia Power" may be suggested after the letters "Ge" have been typed into the check payee field.
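By way of illustration only, the per-field MRU list described above might be sketched as follows. The class name MruField, the capacity of 50 entries, and the case-insensitive prefix match are assumptions made for this sketch and are not taken from any particular prior system.

```python
from collections import OrderedDict


class MruField:
    """Most-recently-used entry list for one structured field (hypothetical)."""

    def __init__(self, capacity=50):  # capacity is an assumed limit
        self.capacity = capacity
        self._entries = OrderedDict()  # keys ordered oldest -> newest

    def record(self, entry):
        # Re-recording an entry moves it to the most-recent position.
        self._entries.pop(entry, None)
        self._entries[entry] = True
        if len(self._entries) > self.capacity:
            self._entries.popitem(last=False)  # discard the oldest entry

    def suggest(self, partial):
        # Scan from most recent to oldest and return the first prefix match.
        if not partial:
            return None
        for entry in reversed(self._entries):
            if entry.lower().startswith(partial.lower()):
                return entry
        return None


payee = MruField()
payee.record("Georgia Power")
payee.record("Gwinnett Water")
suggestion = payee.suggest("Ge")  # matches the earlier "Georgia Power" entry
```

In this sketch a shorter partial entry such as "G" would match the more recent payee first, which reflects the recency ordering that gives the MRU list its name.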
In MRU word prediction systems, an input character may be analyzed, with respect to the prior history of text entered, to predict the text likely to follow the input character or string of characters. Because MRU word prediction systems are based upon a prior history of text entered, the search time and amount of storage required for the systems are important parameters. Either a linear or a binary search is typically used to scan the text history in order to provide a text prediction. A linear search operates by sequentially examining each element in a list until the target element is found or the list has been completely processed. Because every entry must be analyzed, linear searches are primarily used with very short lists.
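A linear search of a text history can be sketched as shown below. The function name linear_predict and the sample history are hypothetical; the sketch simply examines each element in order, so its worst-case cost grows in proportion to the length of the list.

```python
def linear_predict(history, partial):
    """Return the first history entry that begins with `partial`, or None."""
    if not partial:
        return None
    for entry in history:  # examine each element in turn
        if entry.startswith(partial):
            return entry   # target found; stop scanning
    return None            # list completely processed without a match


history = ["Georgia Power", "Gas South", "Grocery Mart"]  # sample data only
prediction = linear_predict(history, "Ga")
```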
A binary search locates an item by repeatedly dividing an ordered list in half and searching the half that is known to contain the item. This requires a value for the input that can be compared against values in a list of items arranged in a known sequence, such as ascending numerical order corresponding to alphabetical placement. The binary search begins by comparing the input value against the value in the middle of the list. If the input value is greater than the middle value, the lower half of the list is discarded and the search concentrates on the upper half; if the input value is less than the middle value, the upper half is discarded instead. The input value is again compared with a value in the middle of the new list and again half of the list is discarded. The process continues, with the input value being compared against the middle of each succeeding smaller list, until the desired item is found or no items remain to be searched.
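The halving procedure described above can be sketched as follows. The function name binary_search and the sample word list are assumptions for illustration; the list must already be sorted in ascending order for the comparisons to be meaningful.

```python
def binary_search(sorted_items, target):
    """Return the index of `target` in an ascending list, or -1 if absent."""
    low, high = 0, len(sorted_items) - 1
    while low <= high:
        mid = (low + high) // 2
        if sorted_items[mid] == target:
            return mid
        if target > sorted_items[mid]:
            low = mid + 1    # discard the lower half
        else:
            high = mid - 1   # discard the upper half
    return -1                # no items remain; target is not in the list


words = ["apple", "banana", "cherry", "date", "fig"]  # sample data only
position = binary_search(words, "date")
```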
Both linear and binary searches can require substantial time to complete, particularly for large search lists. MRU word prediction systems therefore tend to be costly in terms of computation resources and performance. Also, without a mechanism for increasing the likelihood of making a correct prediction, such as structured fields in the input data file, the word prediction system may make wrong predictions so often that the system may be perceived as more annoying than useful. For this reason, MRU word prediction systems have typically been deployed in connection with structured fields.
Restricting the search field using a limited word prediction data space, such as a known data range or naming syntax, is another approach to improving the performance of a word prediction system. For example, a spreadsheet program may use the data entries in adjacent rows and columns as a limited data space list for selecting word prediction choices when the user is entering a new heading into the spreadsheet. Similarly, an editing program for software development may use a predefined list of valid function and command names as a limited data space for selecting word prediction choices when the user is writing a software program. Or a filing system may use the list of previously-created file names as a limited data space for selecting word prediction choices when the user is selecting a file. Of course, these limited-data-space word prediction systems only work well when there is a limited and well-defined data space to use for selecting word predictions. They are not well suited to automatic application for all data entries in an unstructured portion of a data file because, in this situation, there is not a readily apparent limited and well-defined data space to use for selecting word prediction choices.
Dictionary-based word prediction systems, such as those found in spell-checking utilities, have also been used in prior word prediction systems. With a dictionary-based word prediction system, the user must activate the spell-checking utility to obtain a suggested spelling for a particular data entry. It would be very disruptive if the spell-checking user interface automatically popped-up with a list of suggested words every time the user entered a data entry that the spell-checker construed as a misspelled word. In addition, the suggestions provided by the spell-checking utility do not typically change based on the context of the data entry, such as a structured field in the data file. Instead, the dictionary-based word prediction system provides the same spelling suggestions regardless of any contextual information that may be ascertained regarding a data entry. Conventional dictionary-based word prediction systems would therefore be overly disruptive if automatically applied for all data entries in an unstructured portion of a data file.
Prior word prediction systems have additional shortcomings when deployed in the multiple-application-program environment that exists on most computer systems. Computer systems often allow for multiple application programs to run simultaneously. For example, a word processing application program, an e-mail application program, and a personal calendar program may all run simultaneously on a typical computer system. User interfaces for these application programs typically appear in different windows displayed on a display screen. The user selects one window at a time to receive input, and then inputs text and commands into the selected window using the keyboard or another text input device.
The word prediction systems discussed above are usually deployed on an individual application program basis. That is, each word prediction system is typically customized to work only with one particular application program. For example, the check writing word prediction system discussed previously works only with the check writing application program, and not with other application programs, such as a word processor or e-mail program running on the same computer system. This causes wasteful duplication of software when similar word prediction systems are implemented by several different application programs. Duplication of items stored in memory can also result. For example, duplicate items may be stored in memory when several different applications keep separate MRU histories or dictionaries. Another problem is that repetitive data entries cannot be identified across several application programs. As a result, the user may have to "teach" several word prediction systems the same set of commonly-used data entries, such as the user's name, address, business name, etc.
Thus, there is a need in the art for a word prediction system that automatically predicts unrestricted word completions for data entries in an unstructured portion of a data file, such as the body of a word processing document or e-mail message. There is a further need for a text prediction system that may operate with multiple application programs with little or no application-specific programming.
The present invention is a word completion system that can automatically predict unrestricted word completions for data entries in an unstructured portion of a data file, such as the body of a word processing document or e-mail message. The word completion system applies prediction criteria to avoid annoying the user by displaying an excessive number of wrong suggestions. Suggested word completions, which may change as the user types a partial data entry, are displayed in a non-disruptive manner and selected using traditional acceptance keystrokes, such as the "tab" key or the "enter" key.
The word completion system may be deployed on an individual application program basis or on an application-independent basis. Application independence is the ability of the same word completion system to work with several different application programs, such as a word processing program, an e-mail program, a spreadsheet program, and so forth. Because different word suggestion lists may be appropriate for different application programs, and for different data files within the same application program, the word completion system allows the user to select one or more suggestion lists for use with each data file. In addition, the individual entries of a word completion list may be limited so that they are only used in certain context-based situations. These context-based limitations effectively allow each word completion list to be subdivided into a group of context-sensitive lists.
A word completion user interface allows the user to customize each suggestion list with user-defined name-completion pairs on an on-going basis. Each suggestion list may also contain certain word completions that are tied to dynamic parameters maintained by the computer system, such as the time, date, registered user, etc. Each suggestion list may also be limited to name-completion pairs in which the completion entries have a predefined property, such as initial letter capitalized, all letters capitalized, occurring at the start of a paragraph, occurring at the end of a paragraph, and so forth. Each suggestion list may also be limited to name-completion pairs that are tied to contextual information, such as structured data fields or context labels assigned manually or by a document-creation aid known as a "wizard."
Generally stated, the invention is a computer-readable medium having computer-executable instructions for running a word completion utility on a computer system. The word completion utility monitors data entry into a data file associated with a program module running on the computer system. The word completion utility identifies a partial data entry in an unstructured portion of the data file, such as the body of a word processing document or e-mail message. The word completion utility selects a suggestion list including a plurality of associated name-completion pairs, each name-completion pair including a name entry and a completion entry. The word completion utility identifies a particular one of the name entries in the suggestion list that corresponds to the partial data entry. The word completion utility then applies prediction criteria to the particular name entry, the particular completion entry, and the partial data entry. If the prediction criteria are met, the word completion utility displays the associated completion entry as a word completion suggestion for the partial data entry. Advantageously, the suggestion list, as well as name-completion pairs within the suggestion list, may be specified by the user.
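The sequence of steps set out above may be sketched, under stated assumptions, as follows. The names Pair and suggest are hypothetical, a name entry is assumed to "correspond" to a partial data entry when the partial entry is a case-insensitive prefix of the name, and the prediction criteria are passed in as a caller-supplied function because their details are a separate matter.

```python
from typing import Callable, List, NamedTuple, Optional


class Pair(NamedTuple):
    """One name-completion pair in a suggestion list (hypothetical type)."""
    name: str
    completion: str


def suggest(partial: str,
            suggestion_list: List[Pair],
            criteria: Callable[[str, str, str], bool]) -> Optional[str]:
    """Return a completion to display for `partial`, or None."""
    if not partial:
        return None
    for pair in suggestion_list:
        # Assumed meaning of "corresponds": partial is a prefix of the name.
        if pair.name.lower().startswith(partial.lower()):
            if criteria(pair.name, pair.completion, partial):
                return pair.completion  # criteria met; display suggestion
            return None                 # criteria not met; stay silent
    return None


def at_least_three(name: str, completion: str, partial: str) -> bool:
    # Placeholder criteria for this sketch: require three typed characters.
    return len(partial) >= 3


pairs = [Pair("Dean", "Dean Hachamovitch")]  # user-specified pair
shown = suggest("Dea", pairs, at_least_three)
```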
The word completion utility may then receive a command indicating acceptance of the completion entry. In response, the word completion utility replaces the partial data entry with the completion entry in the data file. The word completion utility may then identify a character immediately following the command indicating acceptance of the completion entry. In response, the word completion utility determines whether the character is a delimiter character. If the character is not a delimiter character, the word completion utility inserts a space character in the data file between the completion entry and the character.
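The acceptance and delimiter handling described above might be sketched as follows. The function names and the particular set of delimiter characters are assumptions for this sketch; the text does not enumerate which characters count as delimiters. The data file is modeled as a simple string whose end is the insertion point.

```python
DELIMITERS = set(" \t\n.,;:!?)")  # assumed delimiter set, for illustration


def accept_completion(text, partial, completion):
    """Replace the trailing partial data entry with the completion entry."""
    if not text.endswith(partial):
        raise ValueError("partial entry is not at the insertion point")
    return text[: len(text) - len(partial)] + completion


def append_next_char(text, ch):
    """Add the character typed immediately after an accepted completion."""
    if ch in DELIMITERS:
        return text + ch        # delimiter: no extra space is needed
    return text + " " + ch      # non-delimiter: insert a space first


accepted = accept_completion("Regards, Dea", "Dea", "Dean Hachamovitch")
with_period = append_next_char(accepted, ".")
with_letter = append_next_char(accepted, "w")
```

Inserting the space automatically spares the user a keystroke when typing continues with a new word, while punctuation still attaches directly to the completion.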
According to an aspect of the invention, a suggestion list may be limited to name-completion pairs in which the completion entries have a predefined property, such as initial letter capitalized, all letters capitalized, occurring at the start of a paragraph, occurring at the end of a paragraph, and so forth. In addition, the partial data entry may be received in a portion of the data file that has been assigned a context label. In this case, a particular suggestion list may be associated with the context label. For example, a document-creation aid known as a "wizard" may assign paragraph style labels to the various paragraphs in a business letter. Thus, the greeting paragraph may be assigned a "greeting" context label, the body paragraphs may be assigned a "body" context label, and the complimentary closing paragraph may be assigned a "complimentary closing" context label. This allows the suggestion list for the complimentary closing paragraph, for instance, to be limited to a relatively small set of conventional complimentary closing phrases, such as "Sincerely yours," "Very truly yours," "Cordially yours," and the like.
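The association between context labels and suggestion lists may be sketched as a simple mapping. The dictionary layout, the function name suggest_for_context, and the rule of suggesting only when exactly one phrase matches are assumptions for this sketch; the phrases themselves are the examples given above.

```python
# Hypothetical mapping from context label to its suggestion list.
LISTS_BY_CONTEXT = {
    "complimentary closing": [
        "Sincerely yours,",
        "Very truly yours,",
        "Cordially yours,",
    ],
    "body": [],  # no context-specific phrases in this sketch
}


def suggest_for_context(label, partial):
    """Suggest from the list tied to `label`; None unless exactly one match."""
    if not partial:
        return None
    candidates = [
        phrase
        for phrase in LISTS_BY_CONTEXT.get(label, [])
        if phrase.lower().startswith(partial.lower())
    ]
    return candidates[0] if len(candidates) == 1 else None


closing = suggest_for_context("complimentary closing", "Sin")
```

Typing the same letters in a paragraph labeled "body" yields no suggestion in this sketch, because the closing phrases are confined to their own context.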
According to another aspect of the invention, the completion entry may be tied to a dynamic parameter maintained by the computer system, such as the current date, the current time, or the registered user of the computer system. This allows a current date name entry, for example, to be tied to the computer system's clock. Thus, the current date, "June 26, 1997," for instance, may be automatically suggested whenever the user enters the first few letters of the corresponding month, "Jun" in this case.
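A completion entry tied to the system clock might be sketched as follows. The function name suggest_date, the three-letter minimum, and the optional today parameter (included so the example is repeatable) are assumptions for this sketch.

```python
import datetime

MONTHS = [
    "January", "February", "March", "April", "May", "June", "July",
    "August", "September", "October", "November", "December",
]


def suggest_date(partial, today=None):
    """Suggest the current date when `partial` begins the current month."""
    today = today or datetime.date.today()  # dynamic parameter: system clock
    month = MONTHS[today.month - 1]
    # Assumed threshold: at least three letters of the month must be typed.
    if len(partial) >= 3 and month.lower().startswith(partial.lower()):
        return f"{month} {today.day}, {today.year}"
    return None


sample_day = datetime.date(1997, 6, 26)
date_suggestion = suggest_date("Jun", sample_day)
```

Because the completion is computed when the suggestion is made, the same name entry produces a different completion on a different day.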
According to yet another aspect of the invention, the prediction criteria include a first condition that the partial data entry include a certain number of characters. The prediction criteria may also include a second condition that the completion entry include a certain number of characters more than the partial data entry. The prediction criteria may further include a third condition that the partial data entry unambiguously correspond to the particular name entry with respect to all of the name entries in the suggestion list. The prediction criteria increase the likelihood that each word completion suggestion will be correct, which avoids annoying the user with an excessive number of wrong suggestions.
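The three conditions may be sketched as a single test. The function name meets_criteria, the thresholds of three typed characters and two saved characters, and the use of case-insensitive prefix matching to decide ambiguity are assumptions for this sketch; the text specifies only "a certain number" for each threshold.

```python
def meets_criteria(partial, name, completion, all_names,
                   min_partial=3, min_gain=2):
    """Apply the three prediction conditions; thresholds are assumed values."""
    # First condition: enough characters have been typed.
    if len(partial) < min_partial:
        return False
    # Second condition: the completion is sufficiently longer than the
    # partial entry for the suggestion to be worth offering.
    if len(completion) - len(partial) < min_gain:
        return False
    # Third condition: the partial entry matches this name entry and no other.
    matches = [n for n in all_names if n.lower().startswith(partial.lower())]
    return len(matches) == 1 and matches[0] == name


names = ["Dean", "Dear Sir"]  # sample name entries only
ambiguous = meets_criteria("Dea", "Dean", "Dean Hachamovitch", names)
unambiguous = meets_criteria("Dean", "Dean", "Dean Hachamovitch", names)
```

In this example "Dea" is rejected because it could lead to either name entry, whereas "Dean" identifies a single entry and the suggestion is allowed.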
That the invention improves over the drawbacks of prior word prediction systems and accomplishes the advantages described above will become apparent from the following detailed description of the exemplary embodiments and the appended drawings and claims.